Real AWS and Transfer-Encoding: chunked #139
Comments
Hi Max,

Thanks for trying out MarFS. For writes through fuse, the ultimate size of the object isn't known ahead of time, so we have to use chunked transfer-encoding instead of providing an explicit content-length in the PUT headers. We tested against a different S3 implementation, which does support CTE. Based on the forum message you mentioned, it looks like we would have to add an 'x-amz-decoded-content-length' header. That sounds feasible, at first, but they also say:

[See http://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html]

Maybe I don't understand, but that sounds like it defeats at least one purpose of CTE.

Meanwhile, we have another tool that we use for copying files in parallel to MarFS (https://github.com/pftool/pftool/tree/cpp -- make sure to use the "cpp" branch). In this case, we do know the final size of the destination, so we can skip using CTE. We haven't tested with S3 repos in quite a while, but this was working many months ago. If you just wanted to copy files into MarFS, this might work for you.

Thanks,
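[Editor's note: to make the content-length vs. chunked distinction concrete, here is a minimal, hypothetical libcurl sketch. It is not MarFS/aws4c code (MarFS goes through aws4c.c, which sits on libcurl, as the log below shows); the URL and the callback are placeholders. It only illustrates that an upload with a known size carries a Content-Length header, while an upload with an unknown size falls back to "Transfer-Encoding: chunked", which is what real AWS S3 rejects with 501 unless the sigv4 streaming headers are used.]

#include <stdio.h>
#include <curl/curl.h>

/* Illustrative read callback: supplies object data from a FILE*. */
static size_t read_cb(char *buf, size_t size, size_t nitems, void *stream) {
    return fread(buf, size, nitems, (FILE *)stream);
}

/* Pass known_size < 0 to simulate the fuse-write case (size unknown). */
static void put_object(FILE *src, long long known_size) {
    CURL *curl = curl_easy_init();
    if (!curl) return;

    curl_easy_setopt(curl, CURLOPT_URL, "https://bucket.s3.amazonaws.com/key"); /* placeholder */
    curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
    curl_easy_setopt(curl, CURLOPT_READFUNCTION, read_cb);
    curl_easy_setopt(curl, CURLOPT_READDATA, src);

    if (known_size >= 0) {
        /* Size known up front: libcurl emits a Content-Length header. */
        curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE, (curl_off_t)known_size);
    }
    /* If no size is given, libcurl uses "Transfer-Encoding: chunked" for
     * HTTP/1.1 uploads -- the case AWS S3 answers with 501 Not Implemented. */

    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
}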
Hi Jeff,

Thank you for your reply. I tried to build the 'cpp' branch of pftool but gave up quickly :) It looks like that branch is under active development and not all library targets are under autoconf control. Nevertheless, I used fakes3 to simulate S3 and completed my test scenarios. The testing also produced two questions:

I see that, in the context of object_stream, it's possible to do a stat() call against the source file pathname, which would give a size for the PUT. Is this architecturally wrong?
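[Editor's note: a minimal sketch of the idea in the question above, for illustration only; the helper name is hypothetical and this is not MarFS code. It just shows stat() returning a size that could then be sent as an explicit Content-Length instead of using CTE.]

#include <sys/stat.h>

/* Hypothetical helper: returns the source file's size, or -1 if unknown
 * (in which case the caller would fall back to chunked transfer-encoding). */
long long size_for_put(const char *src_path) {
    struct stat st;
    if (stat(src_path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}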
Hi Max,

pftool should build reliably, but needs some context. In the MarFS_Install document, there's an illustration of the (ugly) invocation of "configure" to build pftool for MarFS. [Just noticed: you need to run './autogen' before configuring pftool for the first time. I'll add that to the document.]

(1) You can always just copy to/from a MarFS fuse mount using 'cp' or 'rsync'. (This isn't recommended for production systems without some extra measures to protect against the consequences of someone rooting the box.) You can also run multiple such copies in parallel.

(2) pftool internally calls the libmarfs function marfs_open_at_offset(). This function shouldn't be used unless, like pftool, you have some understanding of MarFS chunking (a different concept from chunked transfer-encoding). If you want to take on that responsibility, you could use that function to get MarFS to put content-lengths into PUT requests. If you want to go this route, you would probably first call get_chunksize(), giving it the total size of the file you intend to write. It will return a chunksize. Your calls to marfs_open_at_offset() should then always be at multiples of that chunksize, and the total amount of writes you do on each resulting file-handle should always add up to that chunksize (except perhaps in the final chunk, at the largest offset).

Thanks,
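[Editor's note: a rough sketch of the workflow described above. The function names get_chunksize() and marfs_open_at_offset() come from the comment, but the signatures, the handle type, and the write/close calls below are assumptions for illustration, not the real libmarfs API; consult the libmarfs headers before relying on any of this.]

#include <stddef.h>

extern size_t get_chunksize(size_t total_size);                   /* assumed signature */
extern void  *marfs_open_at_offset(const char *path, size_t off); /* assumed signature */
extern size_t marfs_write(void *fh, const void *buf, size_t len); /* placeholder */
extern void   marfs_close(void *fh);                              /* placeholder */

static void put_with_known_size(const char *path, const char *data, size_t total_size) {
    /* Ask MarFS how big each chunk will be for a file of this total size. */
    size_t chunksize = get_chunksize(total_size);

    for (size_t off = 0; off < total_size; off += chunksize) {
        /* Opens must land on multiples of the chunksize. */
        void *fh = marfs_open_at_offset(path, off);

        /* Each handle gets exactly one chunk's worth of writes,
         * except possibly the final chunk, which may be short. */
        size_t remaining = total_size - off;
        size_t this_chunk = (remaining < chunksize) ? remaining : chunksize;

        marfs_write(fh, data + off, this_chunk);
        marfs_close(fh);
    }
}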
I'm trying to prototype MarFS with AWS S3 and got something I cannot explain. My repo definition in the config is very simple:
What I see in the debug log when I try to copy a file to /marfs/... is that marfs_fuse tries to use a chunked upload, but S3 doesn't support it (https://forums.aws.amazon.com/message.jspa?messageID=561616), and I'm getting 501 Not Implemented:
DEBUG=3 MARFSCONFIGRC=/home/vagrant/marfs.cfg ./marfs_fuse /marfs -d -f
DBG: Reading Config File ID[root]
DBG: Config File /root/.awsAuth
FUSE library version: 2.9.2
nullpath_ok: 0
nopath: 0
utime_omit_ok: 0
unique: 1, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.22
flags=0x0000f7fb
max_readahead=0x00020000
INIT: 7.19
flags=0x00000020
max_readahead=0x00020000
max_write=0x08000000
max_background=0
congestion_threshold=0
unique: 1, success, outsize: 40
unique: 2, opcode: LOOKUP (1), nodeid: 1, insize: 47, pid: 15331
LOOKUP /source
getattr /source
NODEID: 2
unique: 2, success, outsize: 144
unique: 3, opcode: LOOKUP (1), nodeid: 2, insize: 53, pid: 15331
LOOKUP /source/testfile.bin
getattr /source/testfile.bin
NODEID: 3
unique: 3, success, outsize: 144
unique: 4, opcode: OPEN (14), nodeid: 3, insize: 48, pid: 15331
open flags: 0x8001 /source/testfile.bin
[note: here, the PUT goes out without a content-length]
open[140693330726592] flags: 0x8001 /source/testfile.bin
unique: 4, success, outsize: 32
unique: 5, opcode: SETATTR (4), nodeid: 3, insize: 128, pid: 15331
truncate /source/testfile.bin 0
getattr /source/testfile.bin
unique: 5, success, outsize: 120
unique: 6, opcode: REMOVEXATTR (24), nodeid: 3, insize: 53, pid: 15331
removexattr /source/testfile.bin security.ima
unique: 6, error: -61 (No data available), outsize: 16
unique: 7, opcode: GETXATTR (22), nodeid: 3, insize: 68, pid: 15331
getxattr /source/testfile.bin security.capability 0
DBG: Request Time: Fri, 17 Jun 2016 16:26:31 +0000
DBG: StrToSign:
PUT
Fri, 17 Jun 2016 16:26:31 +0000
/xxx/arc/ver.001_003/ns.admins/F___/inode.0000001138/md_ctime.20160617_162538+0000_0/obj_ctime.20160617_162631+0000_0/unq.0/chnksz.80000000/chnkno.0
DBG: Signature: n7W5l5qBlEqjCssP7qZOmoRLaFU=
DBG: aws_curl_enter: 'aws4c.c', line 2084
unique: 7, error: -61 (No data available), outsize: 16
unique: 8, opcode: WRITE (16), nodeid: 3, insize: 131152, pid: 15331
write[140693330726592] 131072 bytes to 0 flags: 0x8001
< HTTP/1.1 501 Not Implemented
< x-amz-request-id: E9B2C441E4092C6D
< x-amz-id-2: 7vs5C6DC9/RT+zzzzz=
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Fri, 17 Jun 2016 16:32:56 GMT
< Connection: close
Server AmazonS3 is not blacklisted
< Server: AmazonS3
<
Closing connection 0
DBG: Return Code: 0
DBG: aws_curl_exit: 'aws4c.c', line 2270
unique: 8, error: -110 (Connection timed out), outsize: 16
Transfer-Encoding
unique: 9, opcode: FLUSH (25), nodeid: 3, insize: 64, pid: 15331
unique: 9, error: -38 (Function not implemented), outsize: 16
unique: 10, opcode: RELEASE (18), nodeid: 3, insize: 64, pid: 0
release[140693330726592] flags: 0x8001
unique: 10, success, outsize: 16
NotImplemented
A header you provided implies functionality that is not implemented
E9B2C441E4092C6D
Is this a configuration issue, or does real AWS S3 just not support this?