Add support for CephFS volumes / sub-volumes #1023
Comments
For completeness' sake, here are some commands to get a new CephFS volume and subvolume up and running, and what the final mount command might look like (I'm fumbling this out of my history, so it's not guaranteed to be 100% accurate):
ceph fs volume create volume-name
ceph fs subvolumegroup create volume-name subvolume-group-name
ceph fs subvolume create volume-name subvolume-name --group_name subvolume-group-name
# this will now spit out a path including the UUID of the subvolume:
ceph fs subvolume getpath volume-name subvolume-name --group_name subvolume-group-name
# then authorize a new client (syntax changes slightly in upcoming version)
ceph fs authorize volume-name client.client-name /volumes/subvolume-group-name/subvolume-name/e7c5cd0c-10fa-42e2-9d48-902544f13d07 rw
# which can be mounted like (fsid can be omitted if it is in ceph.conf, key will be read from keyring in /etc/ceph too):
mount -t ceph [email protected]=/volumes/subvolume-group-name/subvolume-name/e7c5cd0c-10fa-42e2-9d48-902544f13d07 /mnt
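The path printed by getpath in the commands above has the shape /volumes/<group>/<subvolume>/<uuid>. As a minimal sketch, assuming that layout holds, a script could split and sanity-check the path before handing it to ceph fs authorize or mount. The name split_subvolume_path is hypothetical and not part of any Ceph tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Ceph): split a subvolume path of the form
# /volumes/<group>/<subvolume>/<uuid> into its components.
# Prints "<group> <subvolume> <uuid>" on success, returns 1 on any other shape.
split_subvolume_path() {
    local path="$1" lead root group subvol uuid extra
    # The leading "/" yields an empty first field; "volumes" is the second.
    IFS=/ read -r lead root group subvol uuid extra <<< "$path"
    if [[ -n "$lead" || "$root" != "volumes" || -z "$group" \
          || -z "$subvol" || -z "$uuid" || -n "$extra" ]]; then
        return 1
    fi
    printf '%s %s %s\n' "$group" "$subvol" "$uuid"
}
```

On a live cluster the argument would come from ceph fs subvolume getpath; a non-zero return is a hint that the wrong path was copied.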
Just a question: which use case is blocked?
What does your storage configuration look like?
My cluster configuration is:
Steps to create storage pool and deploy instances with sharing files using CephFS volumes
So far it does not look like you are using the
Yes, you are correct. This is why I asked about your use case.
Ah, I see. If I were to automate Incus cluster deployment (or even just deployment for individual consumers of CephFS, while also wanting to handle Incus in the same way), I could instead use the Restful API module of the MGR for many operations, in a way that is much less error prone than the API otherwise is for managing CephFS. I wouldn't need to create individual directory trees, and I would not have to enforce a certain convention for how the trees are laid out (since volumes have their very specific layout).
Quota management also becomes less of a "have to write xattr of specific directory" and much more tightly attached to the subvolume. The combination of getpath and the way the auth management is handled also makes it a little harder to accidentally use the wrong path or something.
This is mostly about automation and programmatically handling things, which is in line with what OpenStack Manila wants for its backend. Especially when administering a Ceph cluster on a team with several admins, however, the added constraints make it much easier to work as a team, since there are no strict conventions to stick to oneself, because Ceph already enforces those. Being able to create multiple volumes, each of which comes with its own pools and MDSs, also greatly improves how things work when you have to separate tenants for whatever reason.
In short: it makes me not trip over my own feet when adding a new isolated filesystem share by taking care of the credential management, directory creation, and quotas, something which I'd surely manage to mess up at least once and like... delete the client.ceph credentials or something (which wouldn't be possible with the
TL;DR: it's just more robust as soon as you need to have separate shares for different clients, and it makes managing the cluster easier if there is a strong separation of concerns.
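To make the quota point concrete, here is a dry-run sketch that only prints the two kinds of command being contrasted: the xattr on a directory versus a resize addressed by subvolume name. The function names are hypothetical; the printed commands (setfattr with ceph.quota.max_bytes, and ceph fs subvolume resize) are the standard ones as far as I know, but check them against your Ceph release before relying on them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the two ways of setting a quota.
# Function names are hypothetical illustrations.

# Plain directory: the quota is an extended attribute on a path you must know.
quota_cmd_xattr() {
    local dir="$1" bytes="$2"
    printf 'setfattr -n ceph.quota.max_bytes -v %s %s\n' "$bytes" "$dir"
}

# Subvolume: the quota is addressed by volume and subvolume name, not by path.
quota_cmd_subvolume() {
    local volume="$1" subvol="$2" bytes="$3" group="${4:-}"
    if [[ -n "$group" ]]; then
        printf 'ceph fs subvolume resize %s %s %s --group_name %s\n' \
            "$volume" "$subvol" "$bytes" "$group"
    else
        printf 'ceph fs subvolume resize %s %s %s\n' "$volume" "$subvol" "$bytes"
    fi
}
```

The second form never mentions a directory, which is the property that makes it harder to set a quota on the wrong tree.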
I appreciated your detailed explanation.
I have run into this issue as well. Or at least the old vs new syntax. My ceph.conf uses DNS names for the monitor addresses, which Incus tries to pass on without resolving:
This also generates a kernel error message:
It would seem the sensible solution is to use the new syntax and not attempt to parse the ceph config file.
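For illustration only: the kernel client expects monitor IP addresses, so something has to resolve names first (my understanding is that the mount.ceph helper does this itself). A minimal sketch of that resolution step, assuming IPv4 host[:port] entries and that getent is installed; resolve_mon is a hypothetical name, not something Incus or Ceph provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "host[:port]" into "ip[:port]".
# Assumptions: IPv4 only, no bracketed IPv6 literals, getent is available.
resolve_mon() {
    local entry="$1" host port="" ip
    host="${entry%%:*}"
    # Keep the port (if any) so it can be re-attached after resolution.
    if [[ "$entry" == *:* ]]; then
        port=":${entry#*:}"
    fi
    # IPv4 literals are passed through unchanged.
    if [[ "$host" =~ ^[0-9]+(\.[0-9]+){3}$ ]]; then
        printf '%s%s\n' "$host" "$port"
        return 0
    fi
    # Otherwise take the first IPv4 address the resolver returns.
    ip="$(getent ahostsv4 "$host" | awk 'NR == 1 { print $1 }')"
    if [[ -z "$ip" ]]; then
        return 1
    fi
    printf '%s%s\n' "$ip" "$port"
}
```

Shelling out to mount.ceph, or using the new syntax, avoids having to reimplement this at all, which is the point made in the comment above.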
I did a bit of poking into the Incus and Ceph code; it seems that the mount.ceph CLI helper does a lot more lifting than I thought it did. I think I could probably make it better and open a PR. However, it would involve either shelling out to mount.ceph on the host or pulling in go-ceph to do most of the work. Perhaps @stgraber could provide some insight on a preferred approach?
Using Last I checked,
To the shell it is then! I didn't realize the nuance regarding go-ceph. I don't suppose there's a mechanism for external storage drivers in the works? That would perhaps offer the best of both worlds.
Not currently. Storage drivers are pretty much constantly hammered, so running those out of process would basically require them running constantly. We also generally try to avoid plugin mechanisms as much as possible in favor of high quality 1st party integration. The problem with starting to have plugin APIs, even if only meant for internal components, is that folks will almost immediately start (ab)using them for other stuff and get mad when we then break them ;)
It's not much but it's a start. That it compiles at all is pretty neat given that I taught myself just enough Go to draft it. Edit: even better! It worked!!
Nice!
I would also like to hear from @benaryorg as I don't believe their use case is covered by mine or CI. This is their "issue" after all. Would you/they be willing to try the draft PR and see if/where it fails?
Given that I'm sorta stuck on LTS and that I'm not sure that I would be affected by irreversible schema changes, I'll have to see if I can spin up a test for this outside my production infrastructure. If the patch were to apply cleanly on the current version I use (6.0.2) I could test a patched version without all that hassle, I just haven't tested that yet (so this is more of a note to myself than anything).
Given that the most current iteration seems to have broken not just the Ceph tests but all the tests, I'm gonna say it's probably not going to apply cleanly. It probably shouldn't be anywhere near production infra either. I'm trying to keep breaking changes to a bare minimum, but one way or another I think something's going to break; 1172 seems applicable to this situation.
@benaryorg You may appreciate this recent work on my branch and now draft PR:
michael ~> ceph fs ls
name: cephfs, metadata pool: cephfs.meta, data pools: [cephfs.data ]
name: test-fs, metadata pool: cephfs.test-fs.meta, data pools: [cephfs.test-fs.data ]
michael ~> incus storage create ceph2 cephfs source=test-fs/
Storage pool ceph2 created
michael ~> incus storage show ceph2
config:
  cephfs.cluster_name: ceph
  cephfs.path: test-fs/
  cephfs.user.name: admin
  source: test-fs/
description: ""
name: ceph2
driver: cephfs
used_by: []
status: Created
locations:
- none
michael ~> ceph fs subvolume create test-fs test-subvol
michael ~> ceph fs subvolume getpath test-fs test-subvol
/volumes/_nogroup/test-subvol/81e9f4f7-e108-4f8e-8d28-15f0737b1262
michael ~> incus storage create ceph2-subvol cephfs source=test-fs/volumes/_nogroup/test-subvol/81e9f4f7-e108-4f8e-8d28-15f0737b1262
Storage pool ceph2-subvol created
Still some cleanup work to do; however, it is currently testing quite nicely.
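In the transcript the pool source is simply the filesystem name followed directly by the path that getpath printed. A small sketch of gluing the two together, assuming that <fs-name><subvolume-path> shape, which is taken from the draft PR output and may well change before merge. The function name incus_cephfs_source is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the "source=" value used in the transcript,
# i.e. the filesystem name immediately followed by the absolute subvolume path.
incus_cephfs_source() {
    local fs="$1" path="$2"
    # getpath prints an absolute path; anything else is probably a mistake.
    if [[ -z "$fs" || "$path" != /* ]]; then
        return 1
    fi
    printf '%s%s\n' "$fs" "$path"
}

# Example against a live cluster (not executed here):
#   incus storage create ceph2-subvol cephfs \
#     source="$(incus_cephfs_source test-fs "$(ceph fs subvolume getpath test-fs test-subvol)")"
```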
Required information
Issue description
CephFS has changed its mount string in Quincy, the version that has recently reached its estimated EoL date (current being Reef, Squid is upcoming AFAIK).
This means that any still active release (talking about upstream, not distros) has a mount string that is different from the one Incus is using right now.
This leads to users having a really hard time trying to mount a CephFS created via the newer CephFS Volumes/Subvolumes mechanism (at least I haven't gotten it working yet).
As described in the discussion boards, the old syntax was:
and a lot of options via the -o parameter (or the appropriate field in the mount syscall).
Notably, Incus does not rely on the config file for this but manually scrapes the mon addresses out of the config file. That has its own issues, because the string matching used is insufficient to handle an initial mon list that refers to the mons by name, with the mons listed in their own sections and their addresses given directly as mon_addr. This means that while mount.ceph can just mount the volume, Incus fails during the parsing phase of the config file.
The new syntax is:
So with the user, the (optional) fsid, and the CephFS name being encoded into the string, there are a few fewer options, although they do still exist.
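As a rough sketch of the difference, based on my reading of the mount.ceph documentation rather than on anything Incus does: the old device string is <mon-addrs>:<path> with the user passed via -o name=, while the new one is <user>@<fsid>.<fs-name>=<path>, where the fsid may be left empty. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that only build the device string for `mount -t ceph`.

# Old style: "<mon1>,<mon2>:<path>"; user and filesystem go into -o options.
old_style_device() {
    local mons="$1" path="$2"
    printf '%s:%s\n' "$mons" "$path"
}

# New style: "<user>@<fsid>.<fs-name>=<path>".
# The user is given without the "client." prefix; fsid may be empty, in which
# case the string becomes "<user>@.<fs-name>=<path>".
new_style_device() {
    local user="$1" fsid="$2" fs="$3" path="$4"
    printf '%s@%s.%s=%s\n' "$user" "$fsid" "$fs" "$path"
}

# Example (not executed here):
#   mount -t ceph "$(new_style_device client-name "" volume-name /)" /mnt
```

With the new form the monitors are no longer part of the device string, so a tool that builds it has no reason to scrape them out of ceph.conf.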
Steps to reproduce
Even with vaguely correct-seeming parameters provided to Incus, this still leads to interesting issues like getting No Route to Host errors despite everything being reachable.
Honestly, if you find options that manage to mount that, please tell me because I can't seem to find any.
Information to attach
Any relevant kernel output (dmesg)
Main daemon log (at /var/log/incus/incusd.log)
Container log (incus info NAME --show-log)
Container configuration (incus config show NAME --expanded)
Output of the client with --debug
Output of the daemon with --debug (alternatively output of incus monitor --pretty while reproducing the issue) (doesn't really log anything about the issue)