[Feature Request] Enhancing Unit Testing for FLA in the Context of Active Development and Diverse GPU Compatibility #209

uniartisan · 2025-03-01T11:14:28Z

Feature Request

Implement a comprehensive, phased unit testing strategy for FLA to ensure compatibility across a wide range of GPUs and improve the overall robustness of the project. This includes refining the unit testing process to cover different types of graphics cards and optimizing the test triggering mechanism.

Motivation

Current Development State: FLA is in an active development phase. However, the existing unit tests have significant flaws. Some samples in the unit tests are incorrect, and the tests fail to execute properly on certain consumer-grade graphics cards.
User-Base Consideration: A large portion of the user community is using NVIDIA 30-series and 40-series graphics cards. Ensuring compatibility with these widely used GPUs is crucial for user satisfaction and adoption.
GPU Scarcity: High-end GPUs like A100 and H100 are scarce. This scarcity necessitates a strategic approach to testing, starting with more accessible resources such as CPUs.

Your Contribution

Initial CPU-Based Testing:
a. Conduct extensive CPU emulation tests for all unit tests. This will help identify and fix basic functional issues without relying on scarce GPUs.
b. Focus on testing the changed files first. This targeted approach will save time and resources during the initial testing phase.
GPU Validation:
a. Once the CPU emulation tests are stable, perform unit tests on A100 and H100 GPUs. The goal is to make all unit tests pass on these high-end GPUs, as they are often used in professional and research settings related to the project.
Manual Full-Test Trigger:
a. Before any code is merged, the maintainers should manually trigger a full suite of unit tests. This ensures that the entire codebase, including newly changed and existing parts, is thoroughly tested.
Expanding to Consumer-Grade GPUs:
a. After achieving stability on A100 and H100, gradually introduce tests for NVIDIA 30-series and 40-series GPUs. This will address the needs of the majority of the user base.
b. Subsequently, add tests for Intel and AMD graphics cards. This will further expand the project's compatibility across different hardware platforms, making the project more robust and adaptable.
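
The "test the changed files first" step above could be sketched roughly as follows. This is only an illustration, not an existing FLA script: the function name `select_tests` and the assumed layout (sources under `fla/<group>/<name>/...`, tests under `tests/<group>/test_<name>.py`) are hypothetical and would need to be adapted to the real repository.

```python
from pathlib import PurePosixPath

# Hypothetical layout, assumed for illustration only:
#   sources: fla/<group>/<name>/...    tests: tests/<group>/test_<name>.py
SOURCE_ROOT = "fla"
TEST_ROOT = "tests"


def select_tests(changed_files, existing_tests):
    """Return the sorted test files worth running for the changed paths."""
    existing = set(existing_tests)
    selected = set()
    for raw in changed_files:
        path = PurePosixPath(raw)
        parts = path.parts
        # A changed test file is re-run directly if it still exists.
        if parts and parts[0] == TEST_ROOT:
            if str(path) in existing:
                selected.add(str(path))
            continue
        # Map fla/<group>/<name>/... to tests/<group>/test_<name>.py.
        if len(parts) >= 3 and parts[0] == SOURCE_ROOT:
            name = PurePosixPath(parts[2]).stem
            candidate = f"{TEST_ROOT}/{parts[1]}/test_{name}.py"
            if candidate in existing:
                selected.add(candidate)
    # Files with no matching test (docs, shared utilities) select nothing here.
    return sorted(selected)
```

In CI, the changed paths could come from `git diff --name-only` against the target branch, and the returned list could be passed to `pytest`. An empty result is a natural point to fall back to the manually triggered full suite described above.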

Triang-jyed-driung · 2025-03-04T05:37:51Z

Currently, chunk_dplr fails on both H800 and 4090 with Triton 3.1.0 and 3.2.0; only the Triton 3.0.0 nightly can run it. It fails exactly as the FAQ describes.

uniartisan added the enhancement and todo labels Mar 1, 2025

uniartisan self-assigned this Mar 1, 2025
