[Bug] engine.share.level=GROUP takes only first AD Group if the user is part of multiple AD Groups #6402

Open
2 of 4 tasks
avishnus opened this issue May 21, 2024 · 2 comments · May be fixed by #6779
Labels
kind:bug This is a clearly a bug priority:major

Comments

Contributor

avishnus commented May 21, 2024

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

  1. When a user belongs to multiple AD groups and engine.share.level is set to GROUP, Hadoop returns all of the user's groups, but Kyuubi takes only the first group from that list.

  2. Kyuubi then expects that AD group to be a valid YARN user and submits the job as that user.

In the error log below, Kyuubi picks the first of the user's AD groups, Internet_User, and checks whether it is a valid YARN user. If it is not, application initialization fails with the error shown.
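
The selection described above can be sketched as follows. This is a minimal Python illustration, not Kyuubi's code: the function name `primary_group` and every group name other than `Internet_User` are hypothetical, and the list stands in for whatever Hadoop's group mapping returns for the user.

```python
from typing import List


def primary_group(groups: List[str]) -> str:
    """Mimic the reported behavior: the first group in the list wins."""
    if not groups:
        raise ValueError("user belongs to no groups")
    return groups[0]


# Hypothetical group list for one user; only Internet_User comes from
# the log in this issue.
groups = ["Internet_User", "data_eng", "analysts"]

# With engine.share.level=GROUP the engine is submitted under this name,
# which fails when no YARN user with that name exists.
submit_as = primary_group(groups)
print(submit_as)
```

Under this assumption the order of the returned list alone decides which group is used, so the result can change whenever group membership changes.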

Expected Behaviour

We should be able to give Kyuubi a list of AD groups to check the user's membership against. If a match is found, the job should be launched as the user, not as the AD group. Multiple users in the same AD group would share the same session, resources, etc., but each job would still be submitted by the individual user.
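
A rough sketch of the requested behavior, written in Python for illustration only. The function `resolve_session`, the user names, and the group names other than `Internet_User` are assumptions, not part of Kyuubi.

```python
from typing import Iterable, List, Tuple


def resolve_session(user: str, user_groups: List[str],
                    allowed_groups: Iterable[str]) -> Tuple[str, str]:
    """Return (shared_group, submit_user) for a connecting user.

    shared_group keys the shared session and resources; submit_user is
    the identity the job runs as, which stays the real user.
    """
    allowed = set(allowed_groups)
    for group in user_groups:
        if group in allowed:
            return group, user
    raise PermissionError(f"{user} is not a member of any configured group")


# Hypothetical users and groups; only Internet_User appears in this issue.
configured = ["data_eng"]
print(resolve_session("alice", ["Internet_User", "data_eng"], configured))
print(resolve_session("bob", ["data_eng", "analysts"], configured))
```

Both users resolve to the same shared group while keeping their own submit identity, which is the split between sharing and submission that the request asks for.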

Affects Version(s)

1.8.2

Kyuubi Server Log Output

Failing this attempt.Diagnostics: [2024-05-06 09:14:47.685]Application application_1714974956360_0003 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is Internet_User
main : requested yarn user is Internet_User
User Internet_User not found

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
avishnus added the kind:bug and priority:major labels on May 21, 2024
Member

pan3793 commented May 22, 2024

That sounds reasonable, and if you look at the code, such functionality is easy to implement.

https://github.com/apache/kyuubi/blob/v1.8.2/kyuubi-server/src/main/scala/org/apache/kyuubi/session/HadoopGroupProvider.scala#L32

For example, we could define kyuubi.session.preferGroup: when it is present, we choose it instead of the first group as the user's primary group, or deny the request if the preferGroup is not valid.
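
The proposal could look roughly like this. It is a Python sketch under stated assumptions, not the actual change: `select_primary_group` and the sample group names are hypothetical, "not valid" is read here as "the user is not a member of that group", and kyuubi.session.preferGroup is only the name suggested in this comment.

```python
from typing import List, Optional


def select_primary_group(user: str, groups: List[str],
                         prefer_group: Optional[str] = None) -> str:
    """Pick the user's primary group, honoring a preferred group if set."""
    if prefer_group is not None:
        if prefer_group in groups:
            return prefer_group
        # Deny instead of silently falling back to another group.
        raise PermissionError(
            f"{user} is not a member of preferred group {prefer_group}")
    if not groups:
        raise ValueError(f"{user} belongs to no groups")
    # No preference configured: keep the existing first-group behavior.
    return groups[0]


groups = ["Internet_User", "data_eng"]  # hypothetical membership
print(select_primary_group("alice", groups))
print(select_primary_group("alice", groups, prefer_group="data_eng"))
```

Keeping the first-group fallback when no preference is set would leave existing deployments unchanged.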

Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Oct 23, 2024
Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Oct 25, 2024
Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Oct 26, 2024
Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Oct 29, 2024
Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Nov 1, 2024

A-little-bit-of-data commented Jan 7, 2025

> That sounds reasonable, and if you look at the code, such functionality is easy to implement.
>
> https://github.com/apache/kyuubi/blob/v1.8.2/kyuubi-server/src/main/scala/org/apache/kyuubi/session/HadoopGroupProvider.scala#L32
>
> For example, we could define kyuubi.session.preferGroup: when it is present, we choose it instead of the first group as the user's primary group, or deny the request if the preferGroup is not valid.

Hello, the changes here are good, but I have another question. If the group is set through kyuubi.session.preferGroup, which is controlled globally in kyuubi-defaults.conf, what if I need several users, for example to start multiple Spark SQL engines?

## User provided Kyuubi configurations
___trino___.spark.app.name=trino
___trino___.spark.executor.instances=2
___trino___.spark.driver.cores=1
___trino___.spark.executor.cores=2
___trino___.spark.kubernetes.driver.limit.cores=2
___trino___.spark.kubernetes.executor.limit.cores=4
___trino___.spark.driver.memory=1g
___trino___.spark.executor.memory=4g

___admin___.spark.app.name=sparksql-admin
___admin___.spark.executor.instances=1
___admin___.spark.driver.cores=1
___admin___.spark.executor.cores=1
___admin___.spark.kubernetes.driver.limit.cores=1
___admin___.spark.kubernetes.executor.limit.cores=1
___admin___.spark.driver.memory=1g
___admin___.spark.executor.memory=1g

___test___.spark.app.name=sparksql-test
___test___.spark.executor.instances=1
___test___.spark.driver.cores=1
___test___.spark.executor.cores=1
___test___.spark.kubernetes.driver.limit.cores=1
___test___.spark.kubernetes.executor.limit.cores=1
___test___.spark.driver.memory=1g
___test___.spark.executor.memory=1g

For example, there are three groups, admin, trino, and test, each used to start a Spark SQL engine. The user xiaoming belongs to all three. How can I specify which group's resources xiaoming should use when starting the Spark SQL engine?
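
To make the question concrete, here is a small Python sketch of how the `___name___.`-prefixed entries above could be looked up once a single group has been chosen for xiaoming. It only illustrates the prefix convention used in this config; `overrides_for` is a hypothetical helper and this is not Kyuubi's actual resolution logic.

```python
import re
from typing import Dict, Iterable

# Matches keys such as "___trino___.spark.app.name".
_PREFIXED_KEY = re.compile(r"^___(?P<name>.+?)___\.(?P<key>.+)$")


def overrides_for(name: str, lines: Iterable[str]) -> Dict[str, str]:
    """Collect the settings whose key is prefixed with ___<name>___."""
    result: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        match = _PREFIXED_KEY.match(key.strip())
        if match and match.group("name") == name:
            result[match.group("key")] = value.strip()
    return result


conf = """
## User provided Kyuubi configurations
___trino___.spark.app.name=trino
___trino___.spark.executor.memory=4g
___admin___.spark.app.name=sparksql-admin
___admin___.spark.executor.memory=1g
""".splitlines()

# Whichever group is selected for xiaoming decides which block applies.
print(overrides_for("trino", conf))
print(overrides_for("admin", conf))
```

The open point is the selection step itself: a single global preferred group cannot express that xiaoming should get the trino settings in one session and the admin settings in another.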
