Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Security Practices for HanLP Based on OpenSSF Scorecard #1931

Merged
merged 3 commits into from
Dec 18, 2024

Conversation

Fix3dP0int
Copy link

Enhance Security Practices for HanLP Based on OpenSSF Scorecard

Description

Dear HanLP Maintainers,
Firstly, I want to express my sincere appreciation for your contributions to HanLP. It’s inspiring to see how this project has grown and made such a significant impact within the open-source community.

The OpenSSF Scorecard is an automated tool that evaluates the security practices of open-source projects based on a variety of criteria. It provides a security score and actionable insights that help improve project safety and risk management.

While reviewing HanLP using the OpenSSF Scorecard, I identified two potential risks that could improve the overall security posture of the project. These are related to Token-Permissions and Pinned-Dependencies, both of which are important for maintaining the integrity and security of the project's CI/CD pipeline.

Issues Identified:

Token Permissions:
In the current workflow configuration, the unit-tests workflow appears to lack explicitly defined permissions. This can be a security risk, as workflows may have broader access than necessary. It is considered best practice to define the exact permissions required for each workflow or job in the permissions field of the GitHub Actions configuration. This will limit access to only what’s needed, reducing the risk of over-permissioning, which can lead to security vulnerabilities.
Pinned Dependencies:
Another area for improvement is how third-party libraries are referenced within the workflow configuration. Some dependencies currently use the @v4 tag, which points to an evolving version of the library. This can introduce issues if the library is updated with breaking changes or vulnerabilities. The OpenSSF scorecard recommends using specific commit hashes for dependencies, which locks the version and ensures that only known secure versions are used. This will prevent unintentional upgrades that might introduce new risks.
My PR:
To address the issues mentioned above, I have submitted a pull request with the following changes:

  • Explicit permissions added to each workflow and job in the configuration files.
  • Dependencies pinned by replacing @v4 references with specific commit hashes for third-party libraries.
  • TODO: replace pip install somepackage by pip install -r requirements.txt --require-hashes. (Please tell me the exact version of the dependencies, I would be happy to complete it.)

Additional Security Suggestions Based on Scorecard:

Branch Protection:
Enabling branch protection rules would add an additional layer of security by requiring status checks to pass before merging code into important branches (such as main or master). This ensures that only properly validated code is merged, reducing the risk of introducing vulnerabilities.
Static Analysis (SAST):
Integrating a Static Application Security Testing (SAST) tool, such as CodeQL in GitHub, into the CI pipeline could help detect vulnerabilities in the source code automatically. This proactive measure can identify and mitigate security issues before they escalate into problems.

Please feel free to review the PR at your convenience. I understand the tremendous effort required to maintain an open-source project, and I deeply appreciate the work you all put into HanLP. I’m happy to assist with any further improvements or clarifications.

Thank you again for your dedication to open source. I look forward to seeing HanLP continue to evolve!

Type of Change

Please check any relevant options and delete the rest.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

None.

Checklist

Check all items that apply.

  • ⚠️Changes must be made on dev branch instead of master
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have checked my code and corrected any misspellings

@Fix3dP0int Fix3dP0int changed the title PR to Enhance Security Practices for HanLP Based on OpenSSF Scorecard Enhance Security Practices for HanLP Based on OpenSSF Scorecard Dec 10, 2024
@hankcs
Copy link
Owner

hankcs commented Dec 18, 2024

Hi @Fix3dP0int ,

Thank you for bringing up the security issue. I agree with most of your proposals including:

  1. Restrict permission to read only
  2. Upgrade GitHub Actions

I've pushed these 2 changes into https://github.com/hankcs/HanLP/tree/workflow-patch and I'll merge it to dev soon.

I don't agree with the follows:

  1. Binding to the immutable full sha1. According to the official GitHub guideline, "the hosted images toolsets (e.g. ubuntu-latest) move forward and if there is a tool breaking issue, actions may react with a patch to a major version to compensate so binding to a specific SHA may prevent you from getting fixes."
  2. replace pip install somepackage by pip install -r requirements.txt --require-hashes. Versions of some dependencies don't work/only work on some platforms and they have to be specified. It would be tedious to maintain multiple requirements.txt for these platforms. E.g.,
if (sys_version_info.major, sys_version_info.minor) == (3, 6) and sys.platform in {'darwin', 'win32'}:
    TOKENIZERS = ['tokenizers==0.10.3']

@hankcs hankcs merged commit d8e6f2a into hankcs:dev Dec 18, 2024
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants