inability to access data from private repos over http; could there be 401 and not 404 #111

Open
yarikoptic opened this issue Jan 21, 2021 · 5 comments

Comments

@yarikoptic
Contributor

Describe the bug

I feel like we had an issue filed about this somewhere, but maybe it was just a discussion, or I simply failed to find it. Maybe I just had in mind the report/discussion I had with Joey (git-annex) on a nearby issue (where I mentioned gin as a future use case).

It seems that gogs (similar to GitHub and, to some extent, GitLab), so as not to reveal the presence of a private repository, does not send a correct HTTP response whenever a request for a private repo's /config is sent over http; it just answers with 404:

$> wget -S https://gin.g-node.org/SakshamSharda/ophys_testing1.git/config
--2021-01-20 21:50:40--  https://gin.g-node.org/SakshamSharda/ophys_testing1.git/config
Resolving gin.g-node.org (gin.g-node.org)... 141.84.41.219
Connecting to gin.g-node.org (gin.g-node.org)|141.84.41.219|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 404 Not Found
  Date: Thu, 21 Jan 2021 02:50:40 GMT
  Server: Apache/2.4.38 (Debian)
  content-type: text/html; charset=UTF-8
  set-cookie: lang=en-US; Path=/; Max-Age=2147483647
  set-cookie: gnode_gin=6b6b51f4d8044711; Path=/; HttpOnly
  set-cookie: _csrf=RhXjLEJiVRdixQmiAp_JOi9p1K46MTYxMTE5NzQ0MDk1MTg1MDM2OQ; Path=/; Expires=Fri, 22 Jan 2021 02:50:40 GMT
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Transfer-Encoding: chunked
2021-01-20 21:50:41 ERROR 404: Not Found.

So, instead of receiving a 401, which would tell the client that authentication is needed, the client would need gin-specific knowledge that a 404 might really mean 401, and then try to authenticate one way or another. That brings me back to the discussion with Joey mentioned above, and in particular his "final comment" that he would be unlikely to implement some "let's try this and that" ;-)

I would say that, unlike in the case of GitHub, where someone could perhaps distill some commercially useful secret from knowing whether someone has a private repository, in the case of gin it would make much more sense to provide a more informative response (401, not 404) in such cases. That would hopefully make git-annex immediately usable with gin for private repos over http (since Joey did address the issue I reported a while back).
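
To illustrate why the status code matters, here is a minimal Python sketch of how a generic HTTP client might decide what to do after requesting a repository's /config. It is not git-annex's actual logic (git-annex is written in Haskell); the names `Action` and `action_for_status` are hypothetical and only mirror the behavior described above.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes for a client probing <repo>/config."""
    USE_REMOTE = "use remote"
    ASK_FOR_CREDENTIALS = "ask for credentials"
    IGNORE_REMOTE = "ignore remote"


def action_for_status(status: int) -> Action:
    """Map an HTTP status code to what the client does next."""
    if 200 <= status < 300:
        # The config was readable, so the remote is usable as-is.
        return Action.USE_REMOTE
    if status == 401:
        # The server says authentication is needed: retry with credentials.
        return Action.ASK_FOR_CREDENTIALS
    # Anything else, including 404, gives no hint that logging in would help.
    return Action.IGNORE_REMOTE


print(action_for_status(404))
print(action_for_status(401))
```

With logic of this kind, the 404 shown above leads straight to ignoring the remote, whereas a 401 would give the client a reason to ask for credentials.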

WDYT?

Meanwhile -- here are some more details since you have asked ;)
Git version

$> git version
git version 2.29.2

$> git annex version
git-annex version: 8.20201127+git54-ga1b227171-1~ndall+1
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.20 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.3 feed-1.0.0.0 ghc-8.4.4 http-client-0.5.13.1 persistent-sqlite-2.8.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7

Operating system

Debian GNU/Linux testing/sid mix

Database

have no clue since do not use any ;)

To Reproduce

$> git clone https://gin.g-node.org/SakshamSharda/ophys_testing1.git ; cd ophys_testing1
Cloning into 'ophys_testing1'...
remote: Enumerating objects: 818, done.
remote: Counting objects: 100% (818/818), done.
remote: Compressing objects: 100% (392/392), done.
remote: Total 818 (delta 178), reused 708 (delta 176)
Receiving objects: 100% (818/818), 58.33 KiB | 138.00 KiB/s, done.
Resolving deltas: 100% (178/178), done.
Upon the initial get, git-annex does not really tell us why it sets annex-ignore for the origin:
$> git annex get --debug segmentation_datasets/suite2p/combined/ops.npy                 
[2021-01-20 21:26:17.832995281] process [2560941] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
[2021-01-20 21:26:17.835696512] process [2560941] done ExitSuccess
[2021-01-20 21:26:17.835898389] process [2560942] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
[2021-01-20 21:26:17.839431904] process [2560942] done ExitFailure 1
[2021-01-20 21:26:17.840032641] process [2560943] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--verify","-q","origin/git-annex"]
[2021-01-20 21:26:17.843650202] process [2560943] done ExitFailure 1
[2021-01-20 21:26:17.844408802] process [2560944] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","write-tree"]
[2021-01-20 21:26:17.847633523] process [2560944] done ExitSuccess
[2021-01-20 21:26:17.857855842] process [2560945] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","commit-tree","4b825dc642cb6eb9a060e54bf8d69288fbee4904","--no-gpg-sign"]
[2021-01-20 21:26:17.862599666] process [2560945] done ExitSuccess
[2021-01-20 21:26:17.863255108] process [2560946] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","update-ref","refs/heads/git-annex","db0b4f936a014c93ec07157b22a337e6c6ad6725"]
[2021-01-20 21:26:17.868570495] process [2560946] done ExitSuccess
[2021-01-20 21:26:17.870083456] process [2560947] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","config","annex.uuid","440ceeae-c98d-443f-b5ea-b7f94e56bc36"]
[2021-01-20 21:26:17.875687417] process [2560947] done ExitSuccess
[2021-01-20 21:26:17.876498569] process [2560948] read: git ["config","--null","--list"]
[2021-01-20 21:26:17.881621746] process [2560948] done ExitSuccess
[2021-01-20 21:26:17.882959388] process [2560949] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
[2021-01-20 21:26:17.886413441] process [2560949] done ExitSuccess
[2021-01-20 21:26:17.886765077] process [2560950] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
[2021-01-20 21:26:17.891487585] process [2560950] done ExitSuccess
[2021-01-20 21:26:17.892499777] process [2560951] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","log","refs/heads/git-annex..db0b4f936a014c93ec07157b22a337e6c6ad6725","--pretty=%H","-n1"]
[2021-01-20 21:26:17.900518989] process [2560951] done ExitSuccess
[2021-01-20 21:26:17.901142963] process [2560952] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","log","refs/heads/git-annex..650dd921c63604abbdc7010e954af2159dc00e2c","--pretty=%H","-n1"]
[2021-01-20 21:26:17.906373128] process [2560952] done ExitSuccess
[2021-01-20 21:26:17.9070611] process [2560953] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
[2021-01-20 21:26:17.907347101] process [2560954] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
(merging origin/git-annex into git-annex...)
[2021-01-20 21:26:17.910935248] process [2560955] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","hash-object","-w","--stdin-paths","--no-filters"]
[2021-01-20 21:26:17.911524844] process [2560956] feed: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","update-index","-z","--index-info"]
[2021-01-20 21:26:17.912230004] process [2560957] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","diff-index","--raw","-z","-r","--no-renames","-l0","--cached","650dd921c63604abbdc7010e954af2159dc00e2c","--"]
[2021-01-20 21:26:17.91767627] process [2560957] done ExitSuccess
[2021-01-20 21:26:17.919775365] process [2560956] done ExitSuccess
[2021-01-20 21:26:17.92089967] process [2560958] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","log","650dd921c63604abbdc7010e954af2159dc00e2c..refs/heads/git-annex","--pretty=%H","-n1"]
[2021-01-20 21:26:17.924247263] process [2560958] done ExitSuccess
(recording state in git...)
[2021-01-20 21:26:17.924630647] process [2560959] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","write-tree"]
[2021-01-20 21:26:17.92898454] process [2560959] done ExitSuccess
[2021-01-20 21:26:17.929434572] process [2560960] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","commit-tree","ed705e04954a4a569d968820d9d3f061e1b3bed0","--no-gpg-sign","-p","refs/heads/git-annex","-p","650dd921c63604abbdc7010e954af2159dc00e2c"]
[2021-01-20 21:26:17.93303884] process [2560960] done ExitSuccess
[2021-01-20 21:26:17.933449988] process [2560961] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","update-ref","refs/heads/git-annex","a05c85926f615be0ec9c53cbebb8a66211f28e69"]
[2021-01-20 21:26:17.936652871] process [2560961] done ExitSuccess
[2021-01-20 21:26:17.93754598] process [2560962] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","config","annex.version","8"]
[2021-01-20 21:26:17.940690537] process [2560962] done ExitSuccess
[2021-01-20 21:26:17.940981966] process [2560963] read: git ["config","--null","--list"]
[2021-01-20 21:26:17.944104344] process [2560963] done ExitSuccess
[2021-01-20 21:26:17.944401706] process [2560964] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","status","--porcelain"]
[2021-01-20 21:26:17.948854902] process [2560964] done ExitSuccess
[2021-01-20 21:26:17.949229329] process [2560965] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","config","filter.annex.smudge","git-annex smudge -- %f"]
[2021-01-20 21:26:17.953734805] process [2560965] done ExitSuccess
[2021-01-20 21:26:17.954137664] process [2560966] read: git ["config","--null","--list"]
[2021-01-20 21:26:17.960850674] process [2560966] done ExitSuccess
[2021-01-20 21:26:17.961363592] process [2560967] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","config","filter.annex.clean","git-annex smudge --clean -- %f"]
[2021-01-20 21:26:17.964468183] process [2560967] done ExitSuccess
[2021-01-20 21:26:17.964722435] process [2560968] read: git ["config","--null","--list"]
[2021-01-20 21:26:17.967956779] process [2560968] done ExitSuccess
(scanning for unlocked files...)
[2021-01-20 21:26:17.968435052] process [2560969] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--head"]
[2021-01-20 21:26:17.975271574] process [2560969] done ExitSuccess
[2021-01-20 21:26:17.976461371] process [2560970] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-tree","--full-tree","-z","-r","--","HEAD"]
[2021-01-20 21:26:17.981553903] process [2560971] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
[2021-01-20 21:26:17.982032827] process [2560972] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
[2021-01-20 21:26:17.987257632] process [2560970] done ExitSuccess
[2021-01-20 21:26:17.988026111] process [2560973] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","symbolic-ref","-q","HEAD"]
[2021-01-20 21:26:17.994583221] process [2560973] done ExitSuccess
[2021-01-20 21:26:17.995610442] process [2560974] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","refs/heads/master"]
[2021-01-20 21:26:17.999086705] process [2560974] done ExitSuccess
[2021-01-20 21:26:17.999584354] process [2560975] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","symbolic-ref","-q","HEAD"]
[2021-01-20 21:26:18.002735711] process [2560975] done ExitSuccess
[2021-01-20 21:26:18.003693557] process [2560976] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/master"]
[2021-01-20 21:26:18.011399384] process [2560976] done ExitSuccess
[2021-01-20 21:26:18.01189873] process [2560977] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","checkout","-q","-B","master"]
[2021-01-20 21:26:18.03948708] process [2560977] done ExitSuccess
[2021-01-20 21:26:18.040531497] process [2560986] read: uname ["-n"]
[2021-01-20 21:26:18.043076734] process [2560986] done ExitSuccess
[2021-01-20 21:26:18.043999768] process [2560987] chat: /usr/lib/git-annex.linux/bin/git-annex ["init","--autoenable"]
[2021-01-20 21:26:19.443974995] process [2560987] done ExitSuccess
[2021-01-20 21:26:19.444789989] process [2561006] read: git ["config","--null","--list"]
[2021-01-20 21:26:19.458780238] process [2561006] done ExitSuccess
[2021-01-20 21:26:19.459752234] process [2561007] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","symbolic-ref","-q","HEAD"]
[2021-01-20 21:26:19.474310994] process [2561007] done ExitSuccess
[2021-01-20 21:26:19.475161273] process [2561008] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","refs/heads/master"]
[2021-01-20 21:26:19.488243614] process [2561008] done ExitSuccess
[2021-01-20 21:26:19.489481126] process [2561009] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-files","--stage","-z","--","segmentation_datasets/suite2p/combined/ops.npy"]
[2021-01-20 21:26:19.49062944] process [2561010] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2021-01-20 21:26:19.491410265] process [2561011] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2021-01-20 21:26:19.494393894] process [2561012] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
get segmentation_datasets/suite2p/combined/ops.npy [2021-01-20 21:26:19.504088286] process [2561013] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
[2021-01-20 21:26:19.504747549] process [2561014] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
(not available) 
  Maybe add some of these git remotes (git remote add ...):
  	032e9910-2e9e-410d-a4cc-739090e82854 -- Saksham@Saksham:~/Documents\NWB\gittest\DataLadOphysNew
   	3d641067-728f-47ff-83fd-3c2db9cc2665 -- git@8242caf9acd8:/data/repos/sakshamsharda/ophys_testing1.git
   	53b7df4a-a525-4c5f-9c45-51c52b94cf79 -- git@8242caf9acd8:/data/repos/sakshamsharda/ophys_testing1.git

  (Note that these git remotes have annex-ignore set: origin)
failed
[2021-01-20 21:26:19.515900214] process [2561013] done ExitSuccess
[2021-01-20 21:26:19.517528967] process [2561014] done ExitSuccess
[2021-01-20 21:26:19.518837283] process [2560955] done ExitSuccess
[2021-01-20 21:26:19.519306927] process [2561012] done ExitSuccess
[2021-01-20 21:26:19.519484528] process done ExitSuccess
[2021-01-20 21:26:19.51957606] process done ExitSuccess
[2021-01-20 21:26:19.519716133] process [2561011] done ExitSuccess
[2021-01-20 21:26:19.519813453] process [2561010] done ExitSuccess
[2021-01-20 21:26:19.519933777] process [2561009] done ExitSuccess
[2021-01-20 21:26:19.520497572] process done ExitSuccess
git-annex: get: 1 failed

But if I reset that config and retry, it becomes a bit more obvious:

$> git annex get --debug segmentation_datasets/suite2p/combined/ops.npy                 
...
[2021-01-20 21:28:12.201047763] process [2561638] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
[2021-01-20 21:28:12.242170299] Request {
  host                 = "gin.g-node.org"
  port                 = 443
  secure               = True
  requestHeaders       = [("Accept-Encoding","identity"),("User-Agent","git-annex/8.20201127+git54-ga1b227171-1~ndall+1")]
  path                 = "/SakshamSharda/ophys_testing1.git/config"
  queryString          = ""
  method               = "GET"
  proxy                = Nothing
  rawBody              = False
  redirectCount        = 10
  responseTimeout      = ResponseTimeoutDefault
  requestVersion       = HTTP/1.1
}


  Remote origin not usable by git-annex; setting annex-ignore
...

although that is also a bit strange, since what shows up is not a 404 but some timeout...

@yarikoptic
Contributor Author

Well, I slept on it for a few minutes, and I think it might actually be something for annex to adjust in its treatment of 404, to be more in line with git: https://git-annex.branchable.com/bugs/be_like_git_and_ask_for_credentials_if_404/ . Let's see what Joey thinks.

@yarikoptic
Contributor Author

See more discussion/details in the aforementioned git-annex issue. So far the conclusion is that it should be up to gogs/gin to return a proper 401 and not 404. GitHub apparently returns 401 whenever the user agent is set to git/1... (note: 401 even when the repository does not exist, thus not revealing the presence/absence of private repos).
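
One way to check how a server answers for a given user agent is sketched below in Python, using only the standard library. The helper names `build_probe` and `probe_status` are made up for illustration, and the user agent is a parameter because the exact value that triggers the 401 is not spelled out here.

```python
import urllib.error
import urllib.request


def build_probe(repo_url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request for <repo_url>/config with a chosen User-Agent."""
    return urllib.request.Request(
        repo_url.rstrip("/") + "/config",
        headers={"User-Agent": user_agent},
        method="GET",
    )


def probe_status(request: urllib.request.Request,
                 opener=urllib.request.urlopen) -> int:
    """Send the request and return the HTTP status code, even for 4xx/5xx."""
    try:
        with opener(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; the code is what we are after.
        return err.code


# Example (needs network access; the user agent value is an assumption):
# request = build_probe(
#     "https://gin.g-node.org/SakshamSharda/ophys_testing1.git", "git/2.29.2")
# print(probe_status(request))
```

Running the same probe with different user agents against different hosts would show whether the answer depends on the client identifying itself as git.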

@achilleas-k
Member

achilleas-k commented Jan 26, 2021

Might be worth asking for this (or offering it) upstream then. They're usually receptive to changes that bring it closer to other hosted git services.

@yarikoptic
Contributor Author

IIRC it was our own patching within gin's gogs that exposes /config, so upstream might not have that at all. But sure, it would be nice to chat with upstream -- maybe a more generic handling (e.g., always return 401 instead of 404 if the repository might exist but is not accessible to the user; return 404 when trying to access something missing within a public repo, or a private one to which access is granted) is feasible.

@achilleas-k
Member

may be a more generic handling (e.g., always return 401 instead of 404 if repository might exist but not accessible to the user

Yeah, this is what I was going for: 401 for unreachable repository paths, so that git triggers logins, like GitHub does.
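
The policy discussed above could be sketched as a small decision function. This is a hypothetical Python illustration, not gogs or gin code (gogs is written in Go); the function name and its boolean parameters are assumptions made for the example.

```python
def status_for_repo_path(repo_exists: bool, repo_is_public: bool,
                         user_has_access: bool, path_exists: bool) -> int:
    """Pick an HTTP status for a request to a path inside a repository."""
    readable = repo_exists and (repo_is_public or user_has_access)
    if not readable:
        # Same answer for "private" and "does not exist", so the response
        # does not reveal whether a private repository is there.
        return 401
    # The caller may read the repository, so a missing path is a plain 404.
    return 200 if path_exists else 404


# Private repo, anonymous user: ask for authentication.
print(status_for_repo_path(True, False, False, True))
# Nonexistent repo: indistinguishable from the private case.
print(status_for_repo_path(False, False, False, False))
# Public repo, missing file: an ordinary "not found".
print(status_for_repo_path(True, True, False, False))
```

The design choice here is that 404 is reserved for cases where the caller is already allowed to see the repository, so it never leaks information about private ones.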
