diff --git a/search/search_index.json b/search/search_index.json index 4eb03f3a..102aa09c 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"
Here we are documenting the processes and work of the AI Validation Team at the Ministry of the Interior and Kingdom Relations in the Netherlands.
We are a team of engineers, UX designers & researchers, and product experts at a policy department.
We work on the following projects within the Transparency of Algorithmic Decision making scope:
graph TB\n ak[<a href='https://minbzk.github.io/Algoritmekader/'>Algoritmekader</a>] <--> tmt\n\n subgraph tmt[Algorithm Management Toolkit]\n st[<a href='/ai-validation/projects/tad/reporting-standard/'>Reporting Standard</a>] --> tad[<a href='https://github.com/MinBZK/tad/'>Algorithm Management Platform</a>]\n tad <--> llm[<a href='/ai-validation/projects/llm-benchmarks/'>LLM Benchmark Tooling</a>]\n end\n\n tmt --> ar[<a href='https://algoritmes.overheid.nl/en/'>The Algorithm Register of the Dutch government</a>]\n tmt --> or[Other registries]
"},{"location":"#contribute","title":"Contribute","text":"Read our guide on how to contribute.
"},{"location":"#contact","title":"Contact","text":"Our contact details are here.
"},{"location":"about/contact/","title":"Contact","text":"Contact us at ai-validatie@minbzk.nl.
"},{"location":"about/team/","title":"Our Team","text":""},{"location":"about/team/#robbert-bos","title":"Robbert Bos","text":"Product Owner
Robbert has been on a mission for over 15 years to enhance the transparency and collaboration within AI projects. Before joining this team, he founded several data science and tech companies (partly) dedicated to this cause. Robbert is passionate about solving complex problems where he connects business needs with technology and involves others in how these solutions can improve their work.
robbertbos
Robbert Bos
"},{"location":"about/team/#lucas-haitsma","title":"Lucas Haitsma","text":"Researcher in Residence
Lucas is a PhD candidate conducting research into the regulation and governance of algorithmic discrimination by supervision and enforcement organizations. Lucas is our Researcher in Residence.
Lucas Haitsma
rug.nl
"},{"location":"about/team/#berry-den-hartog","title":"Berry den Hartog","text":"Engineer
Berry is a software engineer passionate about problem-solving and system optimization, with expertise in Go, Python, and C++. Specialized in architecting high-volume data processing systems and implementing Lean-Agile and DevOps practices. Experienced in managing end-to-end processes from hardware provisioning to software deployment and release.
berrydenhartog
Berry den Hartog
"},{"location":"about/team/#anne-schuth","title":"Anne Schuth","text":"Engineering Manager
Anne used to be a Machine Learning Engineering Manager at Spotify and previously held roles at DPG Media, Blendle, and Google AI. He holds a PhD from the University of Amsterdam.
anneschuth
Anne Schuth
anneschuth.nl
"},{"location":"about/team/#christopher-spelt","title":"Christopher Spelt","text":"Engineer
After graduating in pure mathematics, Christopher transitioned into machine learning. He is passionate about solving complex problems, especially those that have a societal impact. His expertise lies in math and machine learning theory, and he is skilled in Python.
ChristopherSpelt
Christopher Spelt
"},{"location":"about/team/#robbert-uittenbroek","title":"Robbert Uittenbroek","text":"Engineer
Robbert is a highly enthusiastic full-stack engineer with a Bachelor's degree in Computer Science from the Hanze University of Applied Sciences in Groningen. He is passionate about building secure, compliant, and ethical solutions, and thrives in collaborative environments. Robbert is eager to leverage his skills and knowledge to help shape and propel the future of IT within the government.
uittenbroekrobbert
Robbert Uittenbroek
"},{"location":"about/team/#laurens-weijs","title":"Laurens Weijs","text":"Engineer
Laurens is a passionate guy with a love for innovation and doing things differently. With a background in Econometrics and Computer Science, he loves to tackle the IT challenges of the Government by helping other people through extensive knowledge sharing on stage, building neural networks himself, or building a strong community.
laurensWe
Laurens Weijs
"},{"location":"about/team/#guusje-juijn","title":"Guusje Juijn","text":"Trainee
Guusje is currently enrolled in a two-year traineeship at the Dutch Government. After finishing her first assignment at a policy department, she is excited to bring her knowledge about AI policy to a technical team. Guusje has a background in Artificial Intelligence, is experienced in Python and machine learning and has a strong interest in AI ethics.
GuusjeJuijn
Guusje Juijn
"},{"location":"about/team/#ruben-rouwhof","title":"Ruben Rouwhof","text":"UX/UI Designer
Ruben is a dedicated UX/UI Designer focused on crafting user-centric digital experiences. He is involved in projects from start to finish, covering user research, design, and technical implementation.
rubenrouwhof
Ruben Rouwhof
rubenrouwhof.nl
"},{"location":"about/team/#ravi-meijer","title":"Ravi Meijer","text":"Product Researcher
Ravi is an accomplished data scientist with expertise in machine learning, responsible AI, and the data science lifecycle. Her background in AI fuels her passion for solving complex problems and driving innovation for positive social impact.
ravimeijerrig
Ravi Meijer
"},{"location":"about/team/#our-alumni","title":"Our Alumni","text":""},{"location":"about/team/#willy-tadema","title":"Willy Tadema","text":"AI Ethics Lead
Willy specializes in AI governance, AI risk management, AI assurance and ethics-by-design. She is an advocate of AI standards and a member of several ethics committees.
FrieseWoudloper
Willy Tadema
"},{"location":"adrs/0001-adrs/","title":"ADR-0001 ADRs","text":""},{"location":"adrs/0001-adrs/#context","title":"Context","text":"In modern software development practices, the use of Architecture Decision Records (ADRs) has become increasingly common. ADRs are documents that capture important architectural decisions made during the development process. These decisions play a crucial role in guiding the development team and ensuring consistency and coherence in the architecture of the software system.
"},{"location":"adrs/0001-adrs/#assumptions","title":"Assumptions","text":"We will utilize ADRs in our team to document and communicate architectural decisions effectively. Furthermore, we will publish these ADRs publicly to promote transparency and facilitate collaboration.
"},{"location":"adrs/0001-adrs/#template","title":"Template","text":"Use the template below to add an ADR:
# ADR-XXXX Title\n\n## Context\n\nWhat is the issue that we're seeing that is motivating this decision or change?\n\n## Assumptions\n\nAnything that could cause problems if untrue now or later. (optional)\n\n## Decision\n\nWhat is the change that we're proposing and/or doing?\n\n## Risks\n\nAnything that could cause malfunction, delay, or other negative impacts. (optional)\n\n## Consequences\n\nWhat becomes easier or more difficult to do because of this change?\n\n## More Information\n\nProvide additional evidence/confidence for the decision outcome.\nLinks to other decisions and resources might appear here as well. (optional)\n
"},{"location":"adrs/0002-code-platform/","title":"ADR-0002 Code Platform","text":""},{"location":"adrs/0002-code-platform/#context","title":"Context","text":"In the landscape of software development, the choice of coding platform significantly impacts developer productivity, collaboration, and code quality. it's crucial to evaluate and select a coding platform that aligns with our development needs and fosters efficient workflows.
"},{"location":"adrs/0002-code-platform/#assumptions","title":"Assumptions","text":"The following assumptions are made:
After careful consideration and evaluation of various options like GitHub, GitLab, and Bitbucket, we propose adopting GitHub as our primary coding platform. The decision is based on the following factors:
Costs: There are currently no costs associated with using GitHub for our use cases.
Features and Functionality: GitHub offers a comprehensive set of features essential for modern software development and collaboration with external teams, including version control, code review, issue tracking, continuous integration, and deployment automation.
Security: GitHub offers a complete set of security features essential to secure development like dependency management and security scanning.
Community and Ecosystem: GitHub boasts a vibrant community and ecosystem, facilitating knowledge sharing, collaboration, and access to third-party tools and services that can enhance our development workflows. Within our organization we have easy access to the team managing the GitHub organization.
Usability and User Experience: A user-friendly interface and intuitive workflows are essential for maximizing developer productivity and minimizing onboarding time. GitHub offers a streamlined user experience and customizable workflows that align with our team's preferences and practices.
"},{"location":"adrs/0002-code-platform/#risks","title":"Risks","text":"Currently the organization of MinBZK on GitHub does not have a lot of people
indicating that our team is an early adapter of the platform within the organization. This might impact our features due to cost constrains.
If we choose another tool in the future we will need to migrate our codebase and potentially rewrite some GitHub-specific features that cannot be used in another tool.
"},{"location":"adrs/0002-code-platform/#more-information","title":"More Information","text":"Alternatives considered:
Our development team wants to implement a CI/CD solution to streamline the build, testing, and deployment workflows of our software products. Currently, our codebase resides on GitHub, and we leverage Kubernetes as our chosen orchestration platform, managed by the DigiLab platform team.
"},{"location":"adrs/0003-ci-cd/#decision","title":"Decision","text":"We will use the following tools for CI/CD pipeline:
GitHub Actions aligns with our existing infrastructure, ensuring seamless integration with our codebase and minimizing operational overhead. GitHub Actions' specific syntax for CI results in vendor lock-in, necessitating significant effort to migrate to an alternative CI system in the future.
Flux, being a GitOps operator for Kubernetes, offers a declarative approach to managing deployments, enhancing reliability and repeatability within our Kubernetes ecosystem.
"},{"location":"adrs/0004-software-hosting-platform/","title":"ADR-0004 Software hosting platform","text":""},{"location":"adrs/0004-software-hosting-platform/#context","title":"Context","text":"Our team recognizes the necessity of a platform to run our software, as our local machines lack the capacity to handle certain workloads effectively. We have evaluated several options available to us:
We operate under the following assumptions:
We will use Digilab Kubernetes for our workloads.
"},{"location":"adrs/0004-software-hosting-platform/#consequences","title":"Consequences","text":"By choosing Digilab Kubernetes, we gain access to a namespace within their managed Kubernetes cluster. However, it's important to note that Digilab does not provide any guarantees regarding the availability of the cluster. Should our software require higher availability assurances, we may need to explore alternative solutions.
"},{"location":"adrs/0005-python-tooling/","title":"ADR-0005 Python coding standard and tools","text":""},{"location":"adrs/0005-python-tooling/#context","title":"Context","text":"In modern software development, maintaining code quality is crucial for readability, maintainability, and collaboration. Python, being a dynamically typed language, requires robust tooling to ensure code consistency and type safety. Manual enforcement of coding standards is time-consuming and error-prone. Hence, adopting automated tooling to streamline this process is imperative.
"},{"location":"adrs/0005-python-tooling/#decision","title":"Decision","text":"We will use these standards and tools for our own projects:
When working with external projects these coding standards will not always be possible, but we will try to integrate them as much as possible.
"},{"location":"adrs/0005-python-tooling/#consequences","title":"Consequences","text":"Improved Code Quality: Adoption of these tools will lead to improved code quality, consistency, and maintainability across the project.
Enhanced Developer Productivity: Automated code formatting and static type checking will reduce manual effort and free developers to focus more on coding logic rather than formatting and type-related issues.
Reduced Bug Incidence: Static typing and linting will catch potential bugs and issues early in the development process, reducing the likelihood of runtime errors and debugging efforts.
Standardized Development Workflow: By integrating pre-commit hooks, the development workflow will be standardized, ensuring that all developers follow the same code quality standards.
"},{"location":"adrs/0006-agile-tooling/","title":"ADR-0006 Agile tooling","text":""},{"location":"adrs/0006-agile-tooling/#context","title":"Context","text":"Our development team wants to enhance transparency and productivity in our software development processes. We are using GitHub for version control and collaboration. However, to further streamline our process, there is a need to incorporate tooling for managing the effort of our team.
"},{"location":"adrs/0006-agile-tooling/#decision","title":"Decision","text":"We will use GitHub Projects as our agile process tool
"},{"location":"adrs/0006-agile-tooling/#consequences","title":"Consequences","text":"GitHub Projects seamlessly integrates with our existing GitHub repositories, allowing us to manage our Agile processes. within the same ecosystem where our code resides. This integration eliminates the need for additional third-party tools, simplifying our workflow.
"},{"location":"adrs/0007-commit-convention/","title":"ADR-0007 Commit convention","text":""},{"location":"adrs/0007-commit-convention/#context","title":"Context","text":"In software development, maintaining clear and consistent commit message conventions is crucial for effective collaboration, code review, and project management. Commit messages serve as a form of documentation, helping developers understand the changes introduced by each commit without having to analyze the code diff extensively.
"},{"location":"adrs/0007-commit-convention/#decision","title":"Decision","text":"A commit message must follow the following rules:
\\<ref>-\\<ticketnumber>: subject line
An example of a commit message:
Fix foo to enable bar
or
AB-1234: Fix foo to enable bar
or
Fix foo to enable bar
This fixes the broken behavior of component abc caused by problem xyz.
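To make the convention checkable, a commit-msg hook along the following lines could be used. This is a minimal sketch: the exact pattern (an optional ref-ticketnumber prefix followed by a capitalized subject) is an assumption based on the examples above.

```python
#!/usr/bin/env python3
# Sketch of a commit-msg hook for the convention above; the regex is an
# assumption based on the examples: an optional "AB-1234: " style prefix
# followed by a capitalized subject line.
import re
import sys

SUBJECT = re.compile(r"^(?:[A-Z]+-\d+: )?[A-Z].*$")

def main(msg_file: str) -> int:
    with open(msg_file, encoding="utf-8") as f:
        subject = f.readline().rstrip("\n")
    if not SUBJECT.match(subject):
        print(f"Commit subject does not match convention: {subject!r}")
        return 1
    return 0

if __name__ == "__main__":
    raise SystemExit(main(sys.argv[1]))
```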
If we contribute to projects not started by us we try to follow the above standard unless a specific convention is obvious or required by the project.
"},{"location":"adrs/0007-commit-convention/#consequences","title":"Consequences","text":"In some repositories Conventional Commits are used. This ADR does not follow conventional commits.
"},{"location":"adrs/0008-architectural-diagram-tooling/","title":"ADR-0008 Architectural Diagram Tooling","text":""},{"location":"adrs/0008-architectural-diagram-tooling/#context","title":"Context","text":"To communicate our designs in a graphical manner, it is of importance to draw architectural diagrams. For this we use tooling, that supports us in our work. We need to have something that is written so that it can be processed by both people and machine, and we want to have version control on our diagrams.
"},{"location":"adrs/0008-architectural-diagram-tooling/#decision","title":"Decision","text":"We will write our architectural diagrams in Markdown-like (.mmmd) in the Mermaid Syntax to edit these diagrams one can use the various plugins. For each project where it is needed, we will add the diagrams in the repository of the subject. The level of detail we will provide in the diagrams is according to the C4-model metamodel on architecture diagramming.
"},{"location":"adrs/0008-architectural-diagram-tooling/#consequences","title":"Consequences","text":"Standardized Workflow: By maintaining architecture as code, it will be standardized in our workflow.
Version control on diagrams: By using version control, we will be able to collaborate easier on the diagrams, and we will be able to see the history of them.
Diagrams are in .md format: By storing our diagrams next to our code, they will be where you need them the most.
"},{"location":"adrs/0010-container-registry/","title":"ADR-0010 Container Registry","text":""},{"location":"adrs/0010-container-registry/#context","title":"Context","text":"Containers allow us to package and run applications in a standardized and portable way. To be able to (re)use and share images, they need to be stored in a registry that is accessible by others.
There are many container registries. During research the following registries have been noted:
Docker Hub, GitHub Container Registry, Amazon Elastic Container Registry (ECR), Azure Container Registry (ACR), Google Artifact Registry (GAR), Red Hat Quay, GitLab Container Registry, Harbor, Sonatype Nexus Repository Manager, JFrog Artifactory.
"},{"location":"adrs/0010-container-registry/#assumptions","title":"Assumptions","text":"We will use GitHub Container Registry.
This aligns best with the previously made choices for GitHub as a code repository and CI/CD workflow.
"},{"location":"adrs/0010-container-registry/#risks","title":"Risks","text":"Traditionally, Docker Hub has been the place to publish images. Therefore, our images may be more difficult to discover.
The following assumptions are not (directly) covered by the chosen registry:
By using GitHub Container Registry we have a container registry we can both use internally and share with others. This has low impact: we can always move to another registry, since the Open Container Initiative image format is standardized.
"},{"location":"adrs/0010-container-registry/#more-information","title":"More Information","text":"The following sites have been consulted:
The AI validation team works transparently. Working with public funds warrants transparency toward the public. Additionally, being transparent aligns with the team's mission of increasing the transparency of public organizations. In line with this reasoning, it is important to be open to researchers interested in the work of the AI validation team. Allowing researchers to conduct research within the team contributes to transparency and enables external perspectives and feedback to be incorporated into the team's work.
"},{"location":"adrs/0011-researcher-in-residence/#assumptions","title":"Assumptions","text":"We have decided to include a researcher in residence as a member of our team.
The researcher in residence takes the following form:
The following conditions apply to the researcher in residence.
Risks around a potential chilling effect (team members not feeling free to express themselves) are mitigated by the conditions we impose. In light of the form and conditions above, we see no further significant risks.
"},{"location":"adrs/0011-researcher-in-residence/#consequences","title":"Consequences","text":"Including a researcher in residence makes it easier for them to conduct research within both the team and the wider organization where the AI validation team operates. This benefits the quality of the research findings and the feedback provided to the team and organization.
"},{"location":"adrs/0012-dictionary-for-spelling/","title":"ADR-0012 Dictionary for spelling","text":""},{"location":"adrs/0012-dictionary-for-spelling/#context","title":"Context","text":"We use English as language in some of our external communications, like on GitHub. We noticed that among different documents certain words are spelled correctly but differently, depending on the author or dictionary used. Also there are occasional typos which can cause distraction and don't meet professional standards.
"},{"location":"adrs/0012-dictionary-for-spelling/#assumptions","title":"Assumptions","text":"Standardizing the used dictionary avoids discussion on spelling and makes documents consistent. Eliminating typos contributes to professional, credible and unambiguous documents.
Using a dictionary in a pre-commit hook will prevent commits being made with obvious spelling issues.
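As an illustration of how such a hook could work, here is a minimal sketch in Python. The dictionary location and the word-matching rules are assumptions; in practice an existing spell-checking tool would likely be wired into pre-commit instead.

```python
#!/usr/bin/env python3
# Minimal sketch of a spelling check for a pre-commit hook. The dictionary
# path is an assumption (a plain-text U.S. English word list); a real setup
# would likely use an existing spell-checking tool instead.
import re
import sys

def load_dictionary(path: str) -> set[str]:
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f}

def main(files: list[str]) -> int:
    words = load_dictionary("/usr/share/dict/american-english")  # assumed path
    status = 0
    for name in files:
        with open(name, encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                for word in re.findall(r"[A-Za-z]+", line):
                    if word.lower() not in words:
                        print(f"{name}:{lineno}: unknown word {word!r}")
                        status = 1
    return status

if __name__ == "__main__":
    raise SystemExit(main(sys.argv[1:]))
```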
"},{"location":"adrs/0012-dictionary-for-spelling/#decision","title":"Decision","text":"We will use the U.S. English spelling dictionary.
"},{"location":"adrs/0012-dictionary-for-spelling/#risks","title":"Risks","text":"It may slow down committing large files.
"},{"location":"adrs/0012-dictionary-for-spelling/#consequences","title":"Consequences","text":"Documents will all use the same dictionary for spelling and will not contain typos.
"},{"location":"adrs/0013-date-time-representation/","title":"ADR-0013 Date Time Representation: ISO 8601","text":""},{"location":"adrs/0013-date-time-representation/#context","title":"Context","text":"In our software development projects, we have encountered ambiguity related to the representation of dates and times, particularly when dealing with time zones. The lack of a standardized approach has led to discussions and possibly ambiguity when interpreting timestamps within our applications.
"},{"location":"adrs/0013-date-time-representation/#assumptions","title":"Assumptions","text":"Standardizing the representation of dates and times will improve clarity and precision in our application's logic and user interfaces.
The ISO 8601 format is more human-readable than other formats such as Unix timestamps.
"},{"location":"adrs/0013-date-time-representation/#decision","title":"Decision","text":"We adopt ISO 8601 with timezone notation, preferably in UTC (Z
), as the standard method for representing dates and times in our software projects, replacing the usage of Unix timestamps or any other formats or timezones. We use both dashes (-
) and colons (:
).
We store date and time as: 2024-04-16T16:48:14Z
(preferably with Z
as timezone, representing UTC)
We store dates as 2024-04-16
.
Only when capturing client events might we want to store the client timezone instead of UTC.
When rendering a date and time in a user interface, we may want to localize the date and time for the appropriate timezone.
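A small illustration of these conventions, using only the Python standard library:

```python
# Sketch of the date and time conventions using only the standard library.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

now = datetime.now(timezone.utc)

# Store date-times in UTC with the Z suffix, e.g. 2024-04-16T16:48:14Z.
stored_datetime = now.replace(microsecond=0).isoformat().replace("+00:00", "Z")

# Store plain dates as e.g. 2024-04-16.
stored_date = now.date().isoformat()

# When rendering in a user interface, localize for the user's timezone.
localized = now.astimezone(ZoneInfo("Europe/Amsterdam"))

print(stored_datetime, stored_date, localized.isoformat())
```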
"},{"location":"adrs/0013-date-time-representation/#risks","title":"Risks","text":"Increased storage space: ISO 8601 representations can be longer than other formats, leading to potential increases in storage requirements, especially when dealing with large datasets.
"},{"location":"adrs/0013-date-time-representation/#consequences","title":"Consequences","text":"A single ISO 8601 with UTC timezone provides a clear and unambiguous way to represent dates and times. Its format is easily recognizable and eliminates the need for interpretation. For example: 2024-04-15T10:00:00Z
can easily be understood without needing to parse it using a library.
We will need to regularly convert from localized time to UTC and back when capturing, storing, and rendering dates and times.
"},{"location":"adrs/0013-date-time-representation/#more-information","title":"More Information","text":"ISO 8601 is an internationally recognized standard endorsed by the International Organization for Standardization (ISO). Its adoption offers numerous benefits, including improved clarity, global accessibility, and future-proofing of systems and applications.
For further reading on ISO 8601:
In order to expand our reach and foster international collaboration in the field of AI Validation, we have decided to conduct all communication in English on public platforms such as GitHub. This decision aims to facilitate better understanding and participation from our global colleagues. However, within the Government of the Netherlands, the norm is to communicate in Dutch for internal purposes. This ADR will provide guidelines on which language to use for different types of communications.
"},{"location":"adrs/0014-written-language/#assumptions","title":"Assumptions","text":"There is no requirement to use Dutch as the primary language for all our activities while working for the Government of the Netherlands. More information can be found in the More Information section.
"},{"location":"adrs/0014-written-language/#decision","title":"Decision","text":"The following channels will utilize English:
The primary language for the following channels will be Dutch:
Dutch-only developers will have a harder time following the progress of our team, both in our code on GitHub and in our project management.
"},{"location":"adrs/0014-written-language/#consequences","title":"Consequences","text":"Although many attempts by previous cabinets, Dutch is not the official language in the Netherlands according to the Dutch constitution. See the following link.
According to the website of the Government of the Netherlands, Dutch is the officially recognized language. In combination with the law Algemene wet bestuursrecht on wetten.overheid.nl, this means that governing bodies and their employees need to communicate in Dutch unless stated differently elsewhere. It is stated there that communicating in a language other than Dutch is permitted if the goal of communicating in another language is sufficiently justified and if other parties are not affected disproportionately by the usage of another language.
Right now we have a few organizations (Logius, SSC-ICT, ODC-Noord, Tender process, Digilab, etc.) offering IT infrastructure. This ADR gives an overview of what these different organizations offer, as well as a decision for the AI Validation team on which infrastructure provider we will focus.
"},{"location":"adrs/0016-government-cloud-comparison/#descriptions-and-comparison","title":"Descriptions and comparison","text":"Please see the following picture for an overview of the providers in relation to what they can provide, currently we are heavily searching in the realm of unmanaged infrastructure, as we want this to manage ourselves.
"},{"location":"adrs/0016-government-cloud-comparison/#decision","title":"Decision","text":"For our infrastructure provider we decided to go with Digilab as the main source, as they can provide us with a Kubernetes namespace and are a reliable and convenient partner as we work closely with them.
"},{"location":"adrs/0016-government-cloud-comparison/#risks","title":"Risks","text":"Certain choices are made for us if we make use of the Kubernetes namespace of Digilab, for example that we need to make use of Flux for our CI/CD pipeline.
"},{"location":"adrs/0016-government-cloud-comparison/#extra-information","title":"Extra information","text":"Large Languages Models (LLMs) are becoming increasingly popular in assisting people in a variety of tasks. These tasks include, but are not limited to, information retrieval, assisting with coding and essay writing. In the context of the government, tasks can include for example supporting Freedom of Information Act (FOIA) requests and aiding in answering questions of citizens.
While the potential benefit of using LLMs is large, there are also significant risks. Fundamentally, an LLM is a next-token predictor, which bases its predictions on the user input (context) and on compressed information seen during training (the LLM parameters); hence there is no guarantee on the quality and correctness of the output. Moreover, due to bias in the training data, LLMs can have bias in their output, despite best efforts to mitigate this. Additionally, we have human values that we expect LLMs to be aligned with. Certainly, within the context of a government, we should take utmost care not to discriminate. To assess the quality, correctness, bias and alignment with human values of an LLM, one can perform benchmarks.
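To make the "next-token predictor" point concrete, here is a minimal sketch using the Hugging Face transformers library; the model choice (gpt2) is illustrative only.

```python
# An LLM fundamentally outputs a probability distribution over the next
# token given the context; everything else is repeated sampling from it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The Netherlands is a country in", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence, vocab)

# Probabilities for the token that would come next.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id)!r}: {prob:.3f}")
```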
"},{"location":"projects/llm-benchmarks/#the-project","title":"The project","text":"The LLM Benchmarks project of the AI Validation Team aims to create a platform where LLMs can be measured across a wide range of benchmarks. We limit ourselves to LLMs and benchmarks that are related to the Dutch society. Both LLMs and the benchmarks can be configured by users of the platform. Users can run these benchmarks on LLMs on our platform. The intended goal of this project is to give government organizations, citizens and companies insight in the various LLMs and their quality, correctness, bias and alignment with human values. The project also encompasses a dashboard with uploaded LLMs and their performance on uploaded benchmarks. With this platform we aim to enhance public trust in the usage of LLMs and expose potential bias that exists within LLMs.
"},{"location":"projects/tad/","title":"TAD","text":"TAD is the acronym for Transparency of Algorithmic Decision making. TAD has the goal to make algorithmic systems more transparent; it achieves this by generating standardized reports on the algorithmic system which encompasses both technical aspects in addition to descriptive information about the system and regulatory assessments. For both the system and the model the lifecycle is important and this needs to be taken into account. The definition for an algorithm is derived from the Algoritmeregister.
One of the goals of the TAD project is providing a standardized format for reporting on an algorithmic system by developing a Reporting Standard. This Reporting Standard consists of a System Card which contains Model Cards and Assessment Cards.
The final result of the project is producing System, Model and Assessment Cards with both performance metrics and technical measurements on fairness and bias of the model, assessments on the system where the specific algorithm resides, and descriptive information about the system.
The requirements and instruments are dictated by the Algoritmekader.
"},{"location":"projects/tad/comparison/","title":"Comparison of Reporting Standards","text":"This document assesses standards that standardize the way algorithm assessments can be captured.
"},{"location":"projects/tad/comparison/#background","title":"Background","text":"There are many algorithm assessments (e.g. IAMA, HUIDERIA, etc.), technical tests on performance (e.g. Accuracy, TP, FP, F1, etc), fairness and bias of algorithms (e.g. SHAP) and reporting formats available. The goal is to have a way of standardizing the way these different assessments and tests can be captured.
"},{"location":"projects/tad/comparison/#available-standards","title":"Available standards","text":""},{"location":"projects/tad/comparison/#model-cards","title":"Model Cards","text":"The most interesting existing capturing methods seem to be all based on Model Cards for Model Reporting, which are:
\"Short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information\", proposed by Google. Note that \"The proposed set of sections\" in the Model Cards paper \"are intended to provide relevant details to consider, but are not intended to be complete or exhaustive, and may be tailored depending on the model, context, and stakeholders.\"
Many companies implement their own version of Model Cards, for example Meta System Cards and the tools mentioned in the next section.
"},{"location":"projects/tad/comparison/#automatic-model-card-generation","title":"Automatic model card generation","text":"There exist tools to (semi)-automatically generate models cards:
A landscape analysis of ML documentation tools has been performed by Hugging Face and provides a good overview of the current landscape.
Another interesting standard is the Algorithmic Transparency Recording Standard of the United Kingdom Government, which can be found here.
"},{"location":"projects/tad/comparison/#proposal","title":"Proposal","text":"We need a standard that captures algorithmic assessments and technical tests on model and datasets. The idea of model cards can serve as a guiding theoretical principle on how to implement such a standard. More specifically, we can draw inspiration from the existing model card schema's and implementations of VerifyML and Hugging Face. We note the following:
Hence in any case we need to extend one of these standards. We propose to:
In modern software development practices, the use of Architecture Decision Records (ADRs) has become increasingly common. ADRs are documents that capture important architectural decisions made during the development process. These decisions play a crucial role in guiding the development team and ensuring consistency and coherence in the architecture of the software system.
"},{"location":"projects/tad/adrs/0001-adrs/#assumptions","title":"Assumptions","text":"We will utilize ADRs in this project repository and communicate architectural decisions effectively. Furthermore, we will publish these ADRs publicly to promote transparency and facilitate collaboration.
"},{"location":"projects/tad/adrs/0001-adrs/#template","title":"Template","text":"Use the template below to add an ADR:
# TAD-XXXX Title\n\n## Context\n\nWhat is the issue that we're seeing that is motivating this decision or change?\n\n## Assumptions\n\nAnything that could cause problems if untrue now or later. (optional)\n\n## Decision\n\nWhat is the change that we're proposing and/or doing?\n\n## Risks\n\nAnything that could cause malfunction, delay, or other negative impacts. (optional)\n\n## Consequences\n\nWhat becomes easier or more difficult to do because of this change?\n\n## More Information\n\nProvide additional evidence/confidence for the decision outcome.\nLinks to other decisions and resources might appear here as well. (optional)\n
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/","title":"TAD-0002 TAD Reporting Standard","text":""},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#context","title":"Context","text":"The TAD Reporting Standard proposes a standardized way of capturing information of ML-models and systems.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#assumptions","title":"Assumptions","text":"There is no existing standard of capturing all relevant information on ML-models that also includes fairness and bias tests and regulatory assessments.
A widely used implementation for Model Cards for Model Reporting is given by the Hugging Face Model Card metadata specification, which in turn is based on Papers with Code Model Index. This implementation does not capture sufficient details about metrics and does not include measurements from technical tests on bias and fairness or regulatory assessments.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#decision","title":"Decision","text":"We decided to implement a custom reporting standard. Our reporting standard can be split up into three elements.
We were heavily inspired by the Hugging Face Model Card metadata specification, which we essentially extended to allow for:
The extension is not strict, meaning that there the TAD Reporting Standard is not a valid Hugging Face metadata specification. The reason for this is that some fields in the Hugging Face standard are too much intertwined with the Hugging Face ecosystem and it would not be logical for us to couple our implementation this tightly to Hugging Face.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#risks","title":"Risks","text":"The TAD Reporting Standard is not fully backwards compatible with the Hugging Face Model Card metadata specification. If in the future the Hugging Face Model Card metadata specification becomes a standard, we might need to revise the TAD standard.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#consequences","title":"Consequences","text":"The TAD Reporting Standard allows us to capture relevant information on model performance, bias and fairness and regulatory assessments in a standardized way.
"},{"location":"projects/tad/adrs/0003-tad-tool/","title":"TAD-0003 Tool for Transparency of Algorithmic Decision making","text":""},{"location":"projects/tad/adrs/0003-tad-tool/#context","title":"Context","text":"We are considering tooling for organizations to get more grip on their algorithms. Tooling for, for instance bias and fairness tests, and assessments (like IAMA).
Transparency, we think, can be fostered by sharing reports from such a tool in a standardized way.
There are several existing open source tools which we have assessed. Some support only assessments, others already combine more features and can generate a report. There is however no tool that supports all the requirements we have.
These are the main requirements for our tool:
We will build our own solution. Where possible this solution should be able to re-use certain components of other related open-source projects.
"},{"location":"projects/tad/adrs/0003-tad-tool/#risks","title":"Risks","text":"We can develop a solution that is tailored to the needs of our stakeholders.
"},{"location":"projects/tad/adrs/0004-software-stack/","title":"TAD-0004 Software Stack for TAD","text":""},{"location":"projects/tad/adrs/0004-software-stack/#context","title":"Context","text":"For building our own TAD solution, we need to choose a software stack. During our earlier POCs and market research, we gathered insight and information on technologies to use and which not to use.
During further discussions and brainstorm sessions, a software stack was chosen that accommodates our needs best.
While more fine-grained requirements are listed elsewhere, some key requirements are:
We stick to suitable programming languages. As most AI-related tooling is written in Python, this language is the logical choice for our development as well.
Currently we do not see the need for a separate web GUI framework. It is preferred to bundle backend and frontend in one solution.
As part of a Dutch government organization, we need to adhere to all Dutch laws and standards, like:
We will support the latest 3 minor versions of Python v3 as programming language and Poetry for dependency management.
"},{"location":"projects/tad/adrs/0004-software-stack/#backend","title":"Backend","text":"The Python backend will use the following key dependencies:
We will use server-side rendering of HTML, based on HTMX. For styling and components we will use the NL Design System.
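A minimal sketch of this server-side-rendering approach; the use of FastAPI and the endpoint names here are assumptions for illustration only.

```python
# Sketch of server-side rendering with HTMX: the server returns HTML
# fragments that HTMX swaps into the page, so no separate frontend
# framework is needed. FastAPI is an assumed choice for this sketch.
from fastapi import FastAPI
from fastapi.responses import HTMLResponse

app = FastAPI()

@app.get("/", response_class=HTMLResponse)
def index() -> str:
    return """
    <html>
      <head><script src="https://unpkg.com/htmx.org"></script></head>
      <body>
        <button hx-get="/fragment" hx-swap="outerHTML">Load report</button>
      </body>
    </html>
    """

@app.get("/fragment", response_class=HTMLResponse)
def fragment() -> str:
    # Rendered on the server; HTMX swaps it in place of the button.
    return "<p>Rendered on the server.</p>"
```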
"},{"location":"projects/tad/adrs/0004-software-stack/#testing","title":"Testing","text":"We will use pytest for unit-testing and VCRPY and Playwright for module and integration tests.
"},{"location":"projects/tad/adrs/0004-software-stack/#database","title":"Database","text":"We will use SQLModel or SQL Alchemy with SQLite for development and postgreSQL for production.
"},{"location":"projects/tad/adrs/0004-software-stack/#risks","title":"Risks","text":"As HTMX is relatively more limited than other UI frameworks, it may lack features we require but did not anticipate.
"},{"location":"projects/tad/adrs/0004-software-stack/#consequences","title":"Consequences","text":"We have clarity about the tools to use and develop our TAD tool.
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/","title":"TAD-0005 Add support to run technical tests via AI Verify","text":""},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#context","title":"Context","text":"The AI Verify project is set up in a modular way, and the technical tests is one of the modules. The AI Verify team is developing a feature which makes it possible to run the technical tests using an API: a Python library with a method to run a test and providing the required configuration; for example, which model and dataset to use and some test specific configuration.
The results of the test are returned in JSON format, which can be processed in any way we please, like writing them to a file or System Card, or storing them in a database.
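A hypothetical sketch of what calling such an API could look like; the module, function name, and configuration keys below are assumptions, since the feature is still being developed by the AI Verify team.

```python
# Hypothetical sketch only: the aiverify module, run_test function, and
# configuration keys are assumptions; the real interface may differ.
import json

from aiverify import run_test  # assumed import path

result = run_test(
    test="fairness_metrics_toolbox",            # which technical test to run
    model_path="models/example_model.pkl",      # which model to use
    data_path="data/example_dataset.csv",       # which dataset to use
    config={"sensitive_features": ["gender"]},  # test-specific configuration
)

# The JSON result can be written to a file or System Card, or stored
# in a database.
with open("fairness_result.json", "w", encoding="utf-8") as f:
    json.dump(result, f, indent=2)
```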
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#pros","title":"Pros","text":"Our technical tests will include, but may extend beyond, those offered by AI Verify.
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#risks","title":"Risks","text":"The tests we use from AI Verify are tied to the AI Verify ecosystem. So it uses their (core) modules to load models and datasets. Adding support for other models or data formats, like models written in R, has to be done in the AI Verify core.
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#consequences","title":"Consequences","text":"We have a set of technical tests we can integrate in the TAD tool.
"},{"location":"projects/tad/adrs/0006-extend-system-card-EU-AI-Act/","title":"TAD-0006 Include EU AI Act into System Card","text":""},{"location":"projects/tad/adrs/0006-extend-system-card-EU-AI-Act/#context","title":"Context","text":"The European Union AI Act represents a landmark regulatory framework aimed at ensuring the safe and ethical development and deployment of artificial intelligence technologies within the EU. It defines different policies and requirements for AI systems based on their risk levels, from minimal to unacceptable, to mitigate potential harms. Only for high-risk AI systems, an extended form of documentation is required, including technical documentation. This technical documentation consists of a general description of the AI system and a more detailed, in-depth description (including risk-management, monitoring, etc.).
To ensure that AI systems can be effectively audited, we aim to create a separate instrument called 'technical documentation for high-risk AI systems'. This will allow developers to easily extract and auditors to readily assess all necessary information for the technical documentation.
The RegCheck AI tool, published by Hugging Face, checks model cards for compliance with the EU AI Act. However, this prototype tool is research work and not a commercial or legal product. Furthermore, because we use a modified model card setup, its performance may be less reliable.
"},{"location":"projects/tad/adrs/0006-extend-system-card-EU-AI-Act/#assumptions","title":"Assumptions","text":"The extended system card and proposed instrument will facilitate the documentation of information in accordance with the EU AI Act using the TAD tool.
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/","title":"ALTAI","text":"See the introduction. It is a discussion tool about AI Systems.
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 0 The tool only allows for discussions not technical tests The tool allows users to choose which tests to perform. M 0 See above The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 1 This is very well supported by the tool The tool can generate a human readable report. M 0.9 There is an export functionality for the outcomes of the assessment, it offers a print dialog The tools works with a standardized report format, that it can read, write, and update. M 0 This report cannot be re-imported in a different tool as it only exports to pdf The tool supports plugin functionality so additional tests can be added easily. S 0 Not applicable The tool allows to create custom reports based on components. S 0 The report cannot be customized by the user It is possible to add custom components for reports. S 0 See above The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0.75 There is even for the users an extensive audit trail what happened to assessment, not different model versions The tool supports saving progress. S 1 Yes this is supported The tool can be used on an isolated system without an internet connection. S 1 Yes it can be ran locally or in a docker container without internet The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 1 This is the main feature of the tool The tool operates with complete data privacy; it does not share any data or logging information. C 1 Stored locally in a mongoDB The tool allows extension of report formats functionality. C 0.5 It could be developed that we export to markdown instead of pdf, but right now it just prints the window as pdf The tool can be integrated in a CI/CD flow. C 0 It is an UI tool, so doesn't make sense in a CI/CD pipeline The tool can be offered as a (cloud) service where no local installation is required. C 1 We could host this tool for other parties to use It is possible to define and automate workflows for repetitive tasks. C 0 It is an UI tool The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 Nototal_score = 22.85
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 Yes The tool recovers automatically from common failures. S 1 The tool seems too do this The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 1 The data is stored in mongoDB, so no data is lost The tool handles errors gracefully and informs users of any issues. S 1 If the email server is down the tool still operates The tool provides clear error messages and instructions for troubleshooting. S 0.8 Some errors are not very informative when you get them, but mostly email related aretotal_score = 15.4
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 1 Very clean UI The tool provides clear and consistent navigation, making it easy for users to find what they need. S 1 Compared to AIVerify the navigation is very intuitive (but it also has less features) The tool is responsive and provides instant feedback. S 1 Yes The user interface is multilingual and supports at least English. S 0.8 There is support for multilingual, but the assessments are not translated and needs to be translated by hand The tool offers keyboard shortcuts for efficient interaction. C 0 No The user interface can easily be translated into other languages. C 0.8 The buttons are automatically translated but not the assessment itselftotal_score = 13
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.1 There is little documentation, only the website and the github readme The tool offers context-sensitive help within the application. C 0 The icons are just very clear, would be nice to have a question mark at some places The online documentation includes video tutorials and training materials for ease of learning. C 0 There is no such documentation The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0.25 You can issue tickets on Github, no other way supported waytotal_score = 0.55
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 The docker container is not so very big, also doesn't use much resources The tool responds to user actions instantly. M 1 There is instant feedback in the UI The tool is scalable to accommodate increased user base and data volume. S 1 As it runs on Docker, you can scale this on Kubernetes for multiple userstotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 0.8 You need to be a bit aware of NextJS, then it is easy to maintain as it is not such a large tool The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 0.8 The code looks well structured, they have deployments on github but I don't see any CI or pre-commit hooks The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 NextJS is very common for frontend tools The project provides version control for code changes and rollback capabilities. M 1 The code is hosted on Github so yes The project is open source. M 1 see above It is possible to contribute to the source. S 1 It is possible, not many people have done this yet The system is modular, allowing for easy modification of individual components. S 0.6 Extra assessments can be appended to the system, but not in such a way that it supports multiple (different) assessments, but roles can be changed very easily Diagnostic tools are available to identify and troubleshoot issues. S 0.8 The standard NextJS tools to troubleshoot, but not many teststotal_score = 25.6
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 1 The data is stored in MongoDB Regular security audits and penetration testing are conducted. S 0 When running docker compose up, the docker client will tell there are quite some CVE vulnerabilities in there, an upgrade of the Node version would help much here The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0.5 The tool has support for multiple users and roles (but we couldn't find a user management system) Data encryption is used for sensitive information at rest and in transit. C 1 When data is transferred to mongoDB, a secure connection is set-up and also in the DB it is encrypted by MongoDB, also you have an SSL connection with the tool The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 The tool does allow this, as it is open-source The tool implements backup functionality to ensure data availability in case of incidents. C 1 The data is store in a volume next to the main container of thetotal_score = 7.5
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 As it is a container it can run on Kubernetes and therefore at Digilab The tool supports industry-standard data formats and protocols. M 1 Assessment and other config are stored in JSON The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As it runs in a container it is able to run on all the major OSes if you have Docker Desktop or use a cloud version managed by yourself The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 0 The tool currently only exports a pdf which is not an exchangeable format The tool integrates with existing security solutions. C 0 Not applicable as it is an UItotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0.1 The color scheme is pretty good viewable, but for the rest there are not accessibility featurestotal_score = 0.3
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 1 It is in docker so can run everywhere The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 This is all containerized The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 1 As it is containerized we could host this ourselves in a cloud environment, the Belgium government does not offer a hosted version for you The tool adheres to relevant cloud security standards and best practices. S 0.8 The docker container does contain some outdated versions of for example Node.total_score = 11.4
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 1 It was very easy to install out-of-the-box The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 The tool does not promise on-prem or cloud-based managed deploymentstotal_score = 3
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 It is funded by the Belgian Government The tool is compliant with relevant legal and regulatory requirements. S 1 Yes EU license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data is stored locally The tool implements appropriate security measures to comply with industry regulations and standards. S 1 EUPL 1.2 license (although they say they have MIT license) The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Yes, see above The tool respects intellectual property rights and avoid copyright infringement issues. S 1 Yes, see abovetotal_score = 19
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/","title":"AI Verify","text":"See the introduction
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 This is core functionality of AIVerify The tool allows users to choose which tests to perform. M 1 This is core functionality of AIVerify The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 1 This is core functionality of AIVerify, however work is needed to add extra impact assessments The tool can generate a human readable report. M 1 This is core functionality of AIVerify The tools works with a standardized report format, that it can read, write, and update. M 0 The outputted format is a PDF format, so this cannot be updated, or easily read by a machine. The tool supports plugin functionality so additional tests can be added easily. S 0.5 One can add a test as a plugin, it can however be a bit too technical still for many people. The tool allows to create custom reports based on components. S 1 One can slide the technical tests results and the assessment test results into a report which will be placed into a PDF It is possible to add custom components for reports. S 1 It is possible, but just like with tests can be hard for non-technical people The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0.5 There are versions of models when uploaded, and the report itself is the technical test result of a run. Changes to impact assessments are not logged (only when a report is generated) The tool supports saving progress. S 1 Reports can be saved, while it is being constructed The tool can be used on an isolated system without an internet connection. S 1 Locally the docker container can be build and ran The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 Only the end-result will be logged into the report The tool operates with complete data privacy; it does not share any data or logging information. C 1 The application is a docker application and does not do this The tool allows extension of report formats functionality. C 1 We could program this functionality in the tool and submit a PR The tool can be integrated in a CI/CD flow. C 0.5 It is possible, but would be very heavy to do so. The build time is quite large, and only the technical tests could be ran in an automated fashion The tool can be offered as a (cloud) service where no local installation is required. C 0 AIVerify is currently not doing this, we could however offer it as a cloud service It is possible to define and automate workflows for repetitive tasks. C 0 As this tool is focused on UI, this is not possible The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 This is not includedtotal_score = 36
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 The tool did not break down a single time while we were coding a plugin (only threw errors) The tool recovers automatically from common failures. S 1 Common failures like missing datasets or models are not breaking The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 0.5 The assessments you need to manually save otherwise it will be lost, but over different sessions the data will be stored persistent even if the containers go down. Test results are only stored in the generated report The tool handles errors gracefully and informs users of any issues. S 1 When failed to generate a report the tool will log the error messages, otherwise when loading in data that is non existing the application (while not being very clear in error message) just continues with an error The tool provides clear error messages and instructions for troubleshooting. S 0.5 The test-engine-core is a dependency that is installed as a package, and therefore the error message will not contain error in that packagetotal_score = 13
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 1 The tool does follow the material design principles for example when you hover over items they will respond to user input The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0.5 It is not completely clear where in the tool you are when interacting with it and sometimes you could go back to home but not always The tool is responsive and provides instant feedback. S 1 Even for jobs like generating tests and the report, it scheduled jobs and will notify you when it is done The user interface is multilingual and supports at least English. S 0.5 Currently it only supports english The tool offers keyboard shortcuts for efficient interaction. C 0 It is mainly UI and therefore no keyboard shortcuts The user interface can easily be translated into other languages. C 0.2 It would need quite some refactoring when adding support for the Dutch Language (especially the more technical words like Warning or the metadata on all the pluginstotal_score = 9.4
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.8 From the end-user perspective yes, from the development perspective no (for example that you need to rebuild packages like the test-engine-core The tool offers context-sensitive help within the application. C 0 Not included in the tool The online documentation includes video tutorials and training materials for ease of learning. C 0 Although it contains many images The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0.2 Just email, which they do not respond to very quicklytotal_score = 2.8
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 0.5 The tool is efficient, minimal waiting and no lag although it uses up quite some resources which could be optimized The tool responds to user actions instantly. M 1 Instantaneous response time The tool is scalable to accommodate increased user base and data volume. S 0.5 As it is built into a container it can be made scalable with Kubernetes, but the the tool itself can become very slow when generating results for a large dataset and model (because of the extra overhead)total_score = 7.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 0.2 Adding a new plugin for a model type was quite hard, other plugins however are more easier The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 0.2 The docker side of the project could have a big improvement The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 Backend in Python, Frontend in NextJs The project provides version control for code changes and rollback capabilities. M 0.8 The code is stored on Github, but the container itself not and also the packages which the tools depend on not The project is open source. M 1 Github link It is possible to contribute to the source. S 0.5 It is possible, although with our three features it takes a while for them to dedicated time for integration The system is modular, allowing for easy modification of individual components. S 0.5 The technical tests and assessments are easy to adjust, other core features not Diagnostic tools are available to identify and troubleshoot issues. S 0 Diagnosing some parts of the system took us quite some time as we couldn't properly debug in the containerstotal_score = 15.8
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0.5 This managed by that the data is stored in MongoDB however, it currently only has 1 user support Regular security audits and penetration testing are conducted. S 0.1 We are unaware of the security audits but they do have a security policy here The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 Currently only 1 user can use the system and see all the data Data encryption is used for sensitive information at rest and in transit. C 1 When data is transferred to mongoDB, a secure connection is set-up and also in the DB it is encrypted by MongoDB, also you have an SSL connection with the tool The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 As you can install it locally, this is possible The tool implements backup functionality to ensure data availability in case of incidents. C 1 Data is stored persistent, so even if the tool breaks the data will be in volumestotal_score = 8.3
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 As it is a container it can run on Kubernetes and therefore at Digilab The tool supports industry-standard data formats and protocols. M 1 Most Datasets and Models are supported by the tool The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As it runs in a container it is able to run on all the major OS'es if you have Docker Desktop or use a cloud version managed by yourself The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 0.5 As input many types are accepted, but only as export there is a PDF report The tool integrates with existing security solutions. C 0 It does not integrate with security solutionstotal_score = 12.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 It is not clear what the tool actually does with one look, also the color change when hovering over elements is not a large difference compared to the original color (the purple and pink)total_score = 0
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 1 It is containerized The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 This is all containerized The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 1 As it is containerized we could host this ourselves in a cloud environment The tool adheres to relevant cloud security standards and best practices. S 0.5 The making of the container it self is lacking some best practices, otherwise the cloud security standards are not applicable as it is a self-hosted tooltotal_score = 10.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.5 You need to be technical to be able to install and deploy, but then it is relatively easy The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 The tool does not promise on-prem or cloud-based managed deploymentstotal_score = 1.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 On the website it is stated, that many commercial partners fund this project The tool is compliant with relevant legal and regulatory requirements. S 1 The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 The tool implements appropriate security measures to comply with industry regulations and standards. S 1 The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1total_score = 19
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/","title":"Holistic AI","text":"See the introduction. It is a toolkit just like IBM-360-Toolkit for a data scientist to research bias and also to mitigate it immediately.
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 The tests which can be executed are written here The tool allows users to choose which tests to perform. M 1 In code the user is free to choose any test The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 0 The tool only does technical tests The tool can generate a human readable report. M 0 The toolkit itself cannot make a human readable report, it only generates results which then needs to be interpreted The tools works with a standardized report format, that it can read, write, and update. M 0 The only format it outputs are specific numbers, so no standardized format or even een report format The tool supports plugin functionality so additional tests can be added easily. S 0 All the bias tests are put in a single script which making additional tests a bit cumbersome and leas developer-friendly The tool allows to create custom reports based on components. S 0 Does not allow reports export It is possible to add custom components for reports. S 0 Does not allow reports export The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool supports saving progress. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool can be used on an isolated system without an internet connection. S 1 As a python tool this is possible The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 This is not supported The tool operates with complete data privacy; it does not share any data or logging information. C 1 The local tool does not share anything to the outside world The tool allows extension of report formats functionality. C 0 This is not what the tool is built for The tool can be integrated in a CI/CD flow. C 1 As it is a python package it can be included in a CI pipeline The tool can be offered as a (cloud) service where no local installation is required. C 0 Not immediately, an UI needs to be build around it It is possible to define and automate workflows for repetitive tasks. C 1 Automated tests could be programmed specifically from this tool The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 Not supported by the tooltotal_score = 17
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 The tool recovers automatically from common failures. S 1 The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 1 The tool handles errors gracefully and informs users of any issues. S 1 The tool provides clear error messages and instructions for troubleshooting. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 0 There is no user-interface The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0 There is no user-interface The tool is responsive and provides instant feedback. S 0 There is no user-interface The user interface is multilingual and supports at least English. S 0 There is no user-interface The tool offers keyboard shortcuts for efficient interaction. C 0 There is no user-interface The user interface can easily be translated into other languages. C 0 There is no user-interfacetotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.2 There is some documentation but it is not very helpful The tool offers context-sensitive help within the application. C 0 As a Python tool, no The online documentation includes video tutorials and training materials for ease of learning. C 0 Ths is not there The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0.5 You can contact sales through their website and respond on Github, Github seems to be an okay response time (but not a large community)total_score = 1.6
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 very lightweight as a python package The tool responds to user actions instantly. M 1 It will return output instantly The tool is scalable to accommodate increased user base and data volume. S 1 This would be installed distributed and therefore would be scalable, with large datasets it is still very quicktotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 0.5 It is less modular because most of the tests are written in a single script The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 0.5 They use pre-commit hooks, but the codebase seems to be a bit weirdly structured The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 It is written in Python The project provides version control for code changes and rollback capabilities. M 1 It is hosted on Github The project is open source. M 1 Hosted here It is possible to contribute to the source. S 1 It is possible and they respond to contributions The system is modular, allowing for easy modification of individual components. S 0.5 See the first point Diagnostic tools are available to identify and troubleshoot issues. S 1 Just standard python troubleshooting toolstotal_score = 23.5
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0 Not applicable Regular security audits and penetration testing are conducted. S 0 It is not stated on the repository that they do something with security The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 The tool does not have Users or Access control Data encryption is used for sensitive information at rest and in transit. C 0 Transitionary data is not stored The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 This is not blocked by the tool The tool implements backup functionality to ensure data availability in case of incidents. C 0 Not supportedtotal_score = 2
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 It can be imported in Python The tool supports industry-standard data formats and protocols. M 0 it does not standardize at all in the output of the tests The tool operates seamlessly on supported operating systems and hardware platforms. S 1 Python can be ran on any system The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 1 If it can be imported in Python/R it is supported The tool integrates with existing security solutions. C 0 Not applicabletotal_score = 10
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 You need to be a programmer to use it, and that is not your typical user with disabilitiestotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 0.5 As it is a python tool it is supported anywhere python runs The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 It is a python tool The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 1 The company behind Holistic AI offers a whole range of services included an UI which uses this open-source toolkit The tool adheres to relevant cloud security standards and best practices. S 0 On their website they do not speak about where the data of their solution will go, this is not very transparenttotal_score = 7.5
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.2 You need to have some developer knowledge and also knowledge about the technical tests to use The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 1 Yes the tool can be used as a cloud-based deployment but then with a whole UI around ittotal_score = 3.6
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 The tool is owned by a private company but has been made open source to the public The tool is compliant with relevant legal and regulatory requirements. S 1 Under the apache 2.0 license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data stays locally The tool implements appropriate security measures to comply with industry regulations and standards. S 0 The repo does not speak about security at all The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Under the apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/","title":"IBM Research 360 Toolkit","text":"See the introduction, same thing as verifyML this has no frontend baked in, but has some nice integrations with MLops tooling like Kubeflow Pipelines. The IBM Research 360 toolkit is actually a collection of three open-source toolkits as stated by their Github repo; AI Fairness 360, AI Explainability 360, Adversarial Robustness 360. The strong suite of this toolkit that it considers bias in the whole lifecycle of the model; (dataset, training, output).
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 Fairness, Explainability and security can be tested with the suite of tools The tool allows users to choose which tests to perform. M 1 The websites of contain a whole explanation of which tests to perform AIF Website, AIX website, ART website The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 0 The tool only does technical tests The tool can generate a human readable report. M 0 The toolkit itself cannot make a human readable report, it only generates results which then needs to be interpreted The tools works with a standardized report format, that it can read, write, and update. M 0 The only format it outputs are specific numbers, so no standardized format or even een report format The tool supports plugin functionality so additional tests can be added easily. S 1 Only the repository new tests could be added quite easily if you understand Python The tool allows to create custom reports based on components. S 0 The tool does not generate reports It is possible to add custom components for reports. S 0 The tool does not generate reports The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool supports saving progress. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool can be used on an isolated system without an internet connection. S 1 As it can be imported as a python or r library The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 This is not supported, there is no UI The tool operates with complete data privacy; it does not share any data or logging information. C 1 The tool does not share data The tool allows extension of report formats functionality. C 0 The tool does not generate reports The tool can be integrated in a CI/CD flow. C 1 As it is a programming toolkit it can be used in a CI/CD pipeline The tool can be offered as a (cloud) service where no local installation is required. C 0 not immediately, then an UI needs to be made It is possible to define and automate workflows for repetitive tasks. C 1 We could automate specific tests which we deem necessary or standard The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 Purely written in Pythontotal_score = 20
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 The tool recovers automatically from common failures. S 1 The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 1 The tool handles errors gracefully and informs users of any issues. S 1 The tool provides clear error messages and instructions for troubleshooting. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 0 There is no user-interface The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0 There is no user-interface The tool is responsive and provides instant feedback. S 0 There is no user-interface The user interface is multilingual and supports at least English. S 0 There is no user-interface The tool offers keyboard shortcuts for efficient interaction. C 0 There is no user-interface The user interface can easily be translated into other languages. C 0 There is no user-interfacetotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.8 On the website of the specific toolkit you can find many docs but you cannot search The tool offers context-sensitive help within the application. C 0 Within the application (as it is not an UI, does not offer specific help) The online documentation includes video tutorials and training materials for ease of learning. C 1 The amount of tutorials is extensive even videos of its usage The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 1 You can ask questions at the repository, but also in slack and many people are using thistotal_score = 6.4
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 very lightweight as a python package The tool responds to user actions instantly. M 1 It will return output instantly The tool is scalable to accommodate increased user base and data volume. S 1 This would be installed distributed and therefore would be scalable, with large datasets it is still very quicktotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 1 The repositories are very well structured and therefore easy to adjust The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 1 Although it doesn't have pre-commit hooks it does have a CONTRIBUTING.rst where the rules of good practices are written down The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 It is written in Python The project provides version control for code changes and rollback capabilities. M 1 The code is hosted on Github The project is open source. M 1 At the beginning of this doc you can find the links to the repositories It is possible to contribute to the source. S 1 They have merged many outside requests, so this is fine The system is modular, allowing for easy modification of individual components. S 1 Tests can very easily be added if you understand Python Diagnostic tools are available to identify and troubleshoot issues. S 1 Just standard python troubleshooting toolstotal_score = 29
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0 not applicable Regular security audits and penetration testing are conducted. S 0 It is not stated on the repository that they do something with security The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 The tool does not have Users or Access control Data encryption is used for sensitive information at rest and in transit. C 0 Transitionary data is not stored The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 This is not blocked by the tool The tool implements backup functionality to ensure data availability in case of incidents. C 0 Not supportedtotal_score = 2
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 It can easily be imported in Python or R The tool supports industry-standard data formats and protocols. M 0.5 It does not standardize really on any output from the tests The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As a python and R tool it can be run on systems where these can be ran The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 1 These can be used if they are imported in python and R The tool integrates with existing security solutions. C 1 The Adversarial Robustness Toolbox can be used to test for the security of AI Systemstotal_score = 14
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 You need to be a programmer to use it, and that is not your typical user with disabilitiestotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 0.7 If you can run python, which is not always possible within the government for example, but R could be more easy to be run on places The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 Just a python tool, no UI which is fairly minimal The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 0 It is not offered as a cloud-based option The tool adheres to relevant cloud security standards and best practices. S 0 Not relevanttotal_score = 5.1
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.4 You need to have some developer knowledge and also knowledge about the technical tests to use. But then it is quite easy and works fairly quickly The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 Not applicabletotal_score = 1.2
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 The tool was from IBM, but slowly they are removing the IBM branding from this and the tool is now owned by the LF AI Foundation (where big companies are part of) The tool is compliant with relevant legal and regulatory requirements. S 1 All three tools have apache 2.0 license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data will stay local The tool implements appropriate security measures to comply with industry regulations and standards. S 0 Nothing is known about the security measures of the toolkits The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 All three tools have apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1 The specific tests are implementations of papers which are open for everyonetotal_score = 16
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/","title":"VerifyML","text":"See the introduction, the maker also suggests to use an front-end tool to collaboratively change the model card. Model Card Editor this is not open-source and also the developer suggests in this issue to not use this tool but to use tools like AIVerify. This checklist only looks at the verifyML python toolkit and not the web interface.
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 The tool does allow a few standardized tests, specified here The tool allows users to choose which tests to perform. M 1 In code the user is free to choose any test The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 0 The tool can generate a human readable report. M 1 The tool can visualize model cards that are generated by it The tools works with a standardized report format, that it can read, write, and update. M 1 It generates html which can be imported by a machine The tool supports plugin functionality so additional tests can be added easily. S 1 Any test can be ran by the user itself and the output imported in the model card generated by the tool The tool allows to create custom reports based on components. S 0 It doesn't offer any standardization in what to put in the report It is possible to add custom components for reports. S 1 Anything can be put in the model card, which makes it very flexible The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool supports saving progress. S 1 Once the modelcard is generated it could be loaded in again and be changed The tool can be used on an isolated system without an internet connection. S 1 Once the tool is imported in python it can be used without an internet connection The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 Assessments are not supported The tool operates with complete data privacy; it does not share any data or logging information. C 1 It does not do this The tool allows extension of report formats functionality. C 1 As it exports html, it can also be transferred to json or markdown The tool can be integrated in a CI/CD flow. C 1 The automated tests could be ran in the CI/CD tool to generated a model card The tool can be offered as a (cloud) service where no local installation is required. C 0 The python tool itself not, but a frontend which needs to be developed yes It is possible to define and automate workflows for repetitive tasks. C 1 As it is written in python this can be automated easily The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 The tool does this nottotal_score = 42
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 Once you have located the right (older) libraries it runs pretty smoothly and reliably The tool recovers automatically from common failures. S 0 Library dependencies needs to be solved by yourself as this is not handled by the tool (especially graphs) The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 0 It does not store any intermediary results The tool handles errors gracefully and informs users of any issues. S 0 It just breaks, you need to explicitly export the model card for it to saved The tool provides clear error messages and instructions for troubleshooting. S 0 The error messages are python error messages unrelated to the tooltotal_score = 4
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 0 There is no user interface The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0 There is no user interface The tool is responsive and provides instant feedback. S 0 There is no user interface The user interface is multilingual and supports at least English. S 0 There is no user interface The tool offers keyboard shortcuts for efficient interaction. C 0 There is no user interface The user interface can easily be translated into other languages. C 0 There is no user interfacetotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.5 The documentation is quite concise and helpful, but it is outdated The tool offers context-sensitive help within the application. C 0 No context info whatsoever The online documentation includes video tutorials and training materials for ease of learning. C 0 Just documentation The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0 The people who worked on the tool are quick to respond to issues, but they don't support the tool anymoretotal_score = 1.5
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 Very lightweight tool, as it is a python package The tool responds to user actions instantly. M 1 When run, it returns instantly The tool is scalable to accommodate increased user base and data volume. S 1 This would be installed distributed and therefore would be scalable, with large datasets it is still very quicktotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 1 The tool itself it not so large and written with tools we are all quite aware of The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 1 The repository has poetry, pre-commit hooks, has a CI, and looks well structured The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 in Python and jupyter notebooks The project provides version control for code changes and rollback capabilities. M 1 It is hosted on Github The project is open source. M 1 Apache 2.0 license It is possible to contribute to the source. S 0 The project is not active supported anymore, so we would need to make a fork and make that the main source The system is modular, allowing for easy modification of individual components. S 0.5 The idea of a model card is pretty modular, and can be changed any way we like. Adding assessments in the tool would be quite the effort Diagnostic tools are available to identify and troubleshoot issues. S 1 Just standard python troubleshooting toolstotal_score = 24.5
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0 not applicable Regular security audits and penetration testing are conducted. S 0 As the tool is not actively maintained anymore The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 As this is a local import only, this is managed by the developer Data encryption is used for sensitive information at rest and in transit. C 0 Intermediary data is not stored, and the end result is put in html with no encryption The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 It does not block this for users to do this The tool implements backup functionality to ensure data availability in case of incidents. C 0 Not supportedtotal_score = 2
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 It can be easily imported and installed in python The tool supports industry-standard data formats and protocols. M 1 Standardized tests are used and the output format is html The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As it is a python tool, anywhere where python can run this can also be run The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 1 This can be imported The tool integrates with existing security solutions. C 0 It does not do such a thingtotal_score = 14
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 You need to be a programmer to use it, and that is not your typical user with disabilitiestotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 0.5 If you can run python, which is not always possible within the government for example The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 As it is a python tool The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 0 It is not offered as a cloud-based option The tool adheres to relevant cloud security standards and best practices. S 0 On the github nothing is mentioned about security and for the cloud version it is not applicabletotal_score = 4.5
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.2 You need to have some developer knowledge and also knowledge about the technical tests to use The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 Not applicabletotal_score = 0.6
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 It was developed during a competition and it does not receive funding anymore The tool is compliant with relevant legal and regulatory requirements. S 1 Under the apache 2.0 license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data will stay local The tool implements appropriate security measures to comply with industry regulations and standards. S 0 The repo does not speak about security at all The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Under the apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/comparison/requirements/","title":"Requirements for tools for Transparency of Algorithmic Decision making","text":"This document contains a checklist with requirements for tools we could use to help with the transparency of algorithmic decision making.
The requirements are based on:
The requirements have been given a priority based on the MoSCoW scale to allow for tool comparison.
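The total_score values in the checklists above appear to follow a simple weighted sum: each requirement's fulfilment value (0 to 1) is multiplied by a weight per MoSCoW priority. The weights are not stated explicitly in this document, but Must = 4, Should = 3, Could = 2 reproduces every published total (for example, AI Verify's Reliability checklist scores 4·1 + 3·(1 + 0.5 + 1 + 0.5) = 13). A sketch of that computation:

```python
# Weights per MoSCoW priority, inferred from the published totals above;
# they are an assumption, not stated explicitly in the source document.
WEIGHTS = {"M": 4, "S": 3, "C": 2}

def total_score(items):
    """items: iterable of (priority, fulfilled) pairs, fulfilled in [0, 1]."""
    return sum(WEIGHTS[priority] * fulfilled for priority, fulfilled in items)

# AI Verify's Reliability checklist from earlier in this document:
aiverify_reliability = [("M", 1), ("S", 1), ("S", 0.5), ("S", 1), ("S", 0.5)]
print(total_score(aiverify_reliability))  # 13.0, matching the listed total
```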
"},{"location":"projects/tad/existing-tools/comparison/requirements/#functionality","title":"Functionality","text":"Requirement Priority The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M The tool allows users to choose which tests to perform. M The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M The tool can generate a human readable report. M The tools works with a standardized report format, that it can read, write, and update. M The tool supports plugin functionality so additional tests can be added easily. S The tool allows to create custom reports based on components. S It is possible to add custom components for reports. S The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S The tool supports saving progress. S The tool can be used on an isolated system without an internet connection. S The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C The tool operates with complete data privacy; it does not share any data or logging information. C The tool allows extension of report formats functionality. C The tool can be integrated in a CI/CD flow. C The tool can be offered as a (cloud) service where no local installation is required. C It is possible to define and automate workflows for repetitive tasks. C The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#reliability","title":"Reliability","text":"Requirement Priority The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M The tool recovers automatically from common failures. S The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S The tool handles errors gracefully and informs users of any issues. S The tool provides clear error messages and instructions for troubleshooting. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#usability","title":"Usability","text":"Requirement Priority The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S The tool provides clear and consistent navigation, making it easy for users to find what they need. S The tool is responsive and provides instant feedback. S The user interface is multilingual and supports at least English. S The tool offers keyboard shortcuts for efficient interaction. C The user interface can easily be translated into other languages. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#help-documentation","title":"Help & Documentation","text":"Requirement Priority The tool provides comprehensive online help documentation with searchable functionalities. S The tool offers context-sensitive help within the application. C The online documentation includes video tutorials and training materials for ease of learning. C The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. 
C"},{"location":"projects/tad/existing-tools/comparison/requirements/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority The tool operates efficiently and minimize resource utilization. M The tool responds to user actions instantly. M The tool is scalable to accommodate increased user base and data volume. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#maintainability","title":"Maintainability","text":"Requirement Priority The tool is easy to modify and maintain. M The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M The code is written in a common, widely adopted and supported and actively used and maintained programming language. M The project provides version control for code changes and rollback capabilities. M The project is open source. M It is possible to contribute to the source. S The system is modular, allowing for easy modification of individual components. S Diagnostic tools are available to identify and troubleshoot issues. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#security","title":"Security","text":"Requirement Priority The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M Regular security audits and penetration testing are conducted. S The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C Data encryption is used for sensitive information at rest and in transit. C The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C The tool implements backup functionality to ensure data availability in case of incidents. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#compatibility","title":"Compatibility","text":"Requirement Priority The tool is compatible with existing systems and infrastructure. M The tool supports industry-standard data formats and protocols. M The tool operates seamlessly on supported operating systems and hardware platforms. S The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S The tool integrates with existing security solutions. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#accessibility","title":"Accessibility","text":"Requirement Priority The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S"},{"location":"projects/tad/existing-tools/comparison/requirements/#portability","title":"Portability","text":"Requirement Priority The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S The tool adheres to relevant cloud security standards and best practices. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#deployment","title":"Deployment","text":"Requirement Priority The tool has an easy and user-friendly installation and configuration process. S The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. 
S"},{"location":"projects/tad/existing-tools/comparison/requirements/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority It is clear how the tool is funded to avoid improper influence due to conflicts of interest M The tool is compliant with relevant legal and regulatory requirements. S The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S The tool implements appropriate security measures to comply with industry regulations and standards. S The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S The tool respects intellectual property rights and avoid copyright infringement issues. S"},{"location":"projects/tad/existing-tools/comparison/tools/","title":"Research of tools for Transparency of Algorithmic Decision making","text":"In our ongoing research on AI validation and transparency, we are seeking tools to support assessments. Ideal tools would combine various technical tests with checklists and questionnaires and have the ability to generate reports in both human-friendly and machine-exchangeable formats.
This document contains a list of tools we have found and may want to investigate further.
"},{"location":"projects/tad/existing-tools/comparison/tools/#ai-verify","title":"AI Verify","text":"AI Verify is an AI governance testing framework and software toolkit that validates the performance of AI systems against a set of internationally recognized principles through standardized tests, and is consistent with international AI governance frameworks such as those from European Union, OECD and Singapore.
Links: AI Verify Homepage, AI Verify documentation, AI Verify Github.
"},{"location":"projects/tad/existing-tools/comparison/tools/#to-investigate-further","title":"To investigate further","text":""},{"location":"projects/tad/existing-tools/comparison/tools/#verifyml","title":"VerifyML","text":"What is it? VerifyML is an opinionated, open-source toolkit and workflow to help companies implement human-centric AI practices. It seems pretty much equivalent to AI Verify.
Why interesting? The functionality of this toolkit seems to closely match that of AI Verify. It has a \"git and code first\" approach and supports automatic generation of model cards.
Remarks The code seems to have been last updated two years ago.
Links: VerifyML, VerifyML GitHub
"},{"location":"projects/tad/existing-tools/comparison/tools/#ibm-research-360-toolkit","title":"IBM Research 360 Toolkit","text":"What is it? Open source Python libraries that supports interpretability and explainability of datasets and machine learning models. Most relevant toolkits are the AI Fairness 360 and AI Explainability 360.
Why interesting? Seems to encompass extensive fairness and explainability tests. Codebase seems to be active.
Remarks It comes as Python and R libraries.
Links: AI Fairness 360 Github, AI Explainability 360 Github.
"},{"location":"projects/tad/existing-tools/comparison/tools/#holistic-ai","title":"Holistic AI","text":"What is it? Open source tool to assess and improve the trustworthiness of AI systems. Offers tools to measure and mitigate bias across numerous tasks. Will be extended to include tools for efficacy, robustness, privacy and explainability.
Why interesting? Although it is not entirely clear what exactly this tool does (see Remarks), it does seem (according to their website) to provide reports on bias and fairness. The GitHub repository does not seem to include any report-generating code, but mainly technical tests. Here is an example in which bias is measured in a classification model.
Remarks The website seems to suggest the possibility of generating reports, but this is not directly reflected in the codebase. Possibly reports are only available as part of a licensed product.
Links: Holistic AI Homepage, Holistic AI Github.
"},{"location":"projects/tad/existing-tools/comparison/tools/#ai-assessment-tool","title":"AI Assessment Tool","text":"What is it? The tool is based on the ALTAI published by the European Commission. It is more of a discussion tool about AI Systems.
Why interesting? Although it only includes questionnaires, it does offer an interesting way of reporting the end results. Discussions on, for example, the IAMA can also be documented within the tool.
Remarks The EU's own tool is not open source, but the Belgian tool is. It does not include any technical tests at this point.
Links: AI Assessment Tool Belgium homepage, AI Assessment Tool Belgium Github.
"},{"location":"projects/tad/existing-tools/comparison/tools/#interesting-to-mention","title":"Interesting to mention","text":"What-if. Provides interface for expanding understanding of a black-box classification or regression ML model. Can be accessed through TensorBoard or as an extension in a Jupyter or Colab notebook. Does not seem to be an active codebase.
Aequitas. Open source bias auditing and Fair ML toolkit. This already seems to be contained within AI Verify, at least the 'fairness tree'.
Facets. Open source toolkit for understanding and analyzing ML datasets. Note that it does not include ML models.
Fairness Indicators. Open source Python package which enables easy computation of commonly-identified fairness metrics for binary and multiclass classifiers. Part of TensorFlow.
Fairlearn. Open source Python package that empowers developers of AI systems to assess their system's fairness and mitigate any observed unfairness issues.
Dalex. The DALEX package x-rays any model, helping to explore and explain its behavior and to understand how complex models work. The main function explain() creates a wrapper around a predictive model. Wrapped models may then be explored and compared with a collection of local and global explainers. It incorporates recent developments from the area of Interpretable Machine Learning/eXplainable Artificial Intelligence.
SigmaRed. The SigmaRed platform enables comprehensive third-party AI risk management (AI TPRM) and rapidly reduces the cycle time of conducting AI risk assessments while providing deep visibility, control, stakeholder-based reporting, and a detailed evidence repository. Does not seem to be open source.
Anch.ai. The end-to-end cloud solution empowers global data-driven organizations to govern and deploy responsible, transparent, and explainable AI aligned with the upcoming EU AI Act. Does not seem to be open source.
CredoAI. Credo AI is an AI governance platform that helps companies adopt, scale, and govern AI safely and effectively. Does not seem to be open source.
Paper by TNO about the FATE system. Acronym stands for \"FAir, Transparent and Explainable Decision Making.\"
Tools mentioned include some of the above: Aequitas, AI Fairness 360, Dalex, Fairlearn, Responsibly, and What-If-Tool.
Links: Paper, Article, Microsoft links.
"},{"location":"projects/tad/existing-tools/comparison/tools_comparison/","title":"Comparison of tools for transparency of algorithmic decision making","text":"We have researched a few tools which we want to investigate further, this document is the next step in that investigation. We created a checklist to compare these tools against. The Fulfilled column will give a numerical value based on whether that requirement is fulfilled or not between 0 and 1. Then the actual scoring is the fulfilled value times the priority (the priority is translated to numerical values in the following way: {M:4, S:3, C:2, W:-1}).
"},{"location":"projects/tad/existing-tools/comparison/tools_comparison/#summary-of-the-comparison","title":"Summary of the comparison","text":"Requirement AIVerify VerifyML IBM 360 Research Toolkit Holistic AI AI Assessment Tool Functionality 36 42 20 17 22.85 Reliability 13 4 16 16 15.4 Usability 9.4 0 0 0 13 Help & Documentation 2.8 1.5 6.4 1.6 0.55 Performance Efficiency 7.5 11 11 11 11 Maintainability 15.8 24.5 29 23.5 25.6 Security 8.3 2 2 2 7.5 Compatibility 12.5 14 14 10 11 Accessibility 0 0 0 0 0.3 Portability 10.5 4.5 5.1 7.5 11.4 Deployment 1.5 0.6 1.2 3.6 3 Legal & Compliance 19 16 16 16 19 Total 136.3 120.1 120.7 108.2 140.6"},{"location":"projects/tad/existing-tools/comparison/tools_comparison/#notable-differences-between-the-tools","title":"Notable differences between the tools","text":"AIVerify notes:
Technical tests are supported, but they can be quite slow because of the tool's overhead
More flexibility would need to be built in before people could use the technical tests
If you have many variables, you are not able to show them all in the PDF
The error messages explaining why technical tests do not work on a model are not user-friendly
VerifyML notes:
This tool is no longer actively developed; the parties involved have shifted their focus to AIVerify
This tool does not support assessments
IBM 360 toolkit notes:
The toolkit has strong backing from industry and the community
It includes many technical tests from the latest research and also supports mitigation algorithms
It is purely for developers and therefore has no support for assessments
Holistic AI:
Like the IBM 360 Toolkit, it differentiates between different types of technical assessments, such as bias and explainability, but it is less extensive than the 360 toolkit
Holistic AI's ambition is large: they want to capture efficacy, robustness and privacy tests as well
It is a private company from the United Kingdom that has open-sourced part of its tool
AI Assessment Tool:
This tool does not have any technical tests, but outshines the others with its option to discuss assessments
It is also very performant
AIVerify
is a tool with a UI to execute both assessments and technical tests.
VerifyML
is a Python package to generate Model Cards.
Holistic AI
is a Python package to test for and mitigate Bias in your model.
IBM 360 Research Toolkit
is a Python and R package to test for Fairness & Explainability of your model.
AI Assessment Tool
is a tool with a UI to execute assessments and log discussions.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics that do not fit in the metrics_field from the Hugging Face metadata specification; technical measurements, captured in measurements; and regulatory assessments, captured in assessments.
Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
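A minimal sketch of both styles, reusing the placeholder convention from the examples later in this document:

# Verbatim: the model card is embedded in the system card itself.
models:
  - name: {model_id}
    model: {model_uri}

# Referenced: the model card lives in its own YAML file.
models:
  - !include {model_card_uri}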
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
's and assessment_card
's can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.
name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.
technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, this field allows storing information on external providers. There can be multiple external providers.
name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.
interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Provide information on the user interface provided to the user responsible for its operation.
description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card. This assessment card is an assessment card described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.
model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED).
license_name
(REQUIRED, string). Any license from the open source license list1. If the license is NOT present in the license list this field must be set to 'other', in which case the following license_link field will be REQUIRED.license_link
(OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.model_index
(REQUIRED, list). There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts
uri
(OPTIONAL, string). A URI referring to a relevant model artifact.content-type
(OPTIONAL, string). The type of the artifact, following the Content-Type convention. A recognized value is \"application/onnx\", referring to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). A checksum for the content of the file.parameters
(OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(OPTIONAL, list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(OPTIONAL, list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example \"5503434ddd753f426f4b38109466949a1217c2bb\".metrics
(OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(OPTIONAL, list). This field allows storing meta information about a metric. For example, metrics can be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(OPTIONAL, list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.
name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(OPTIONAL, list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(REQUIRED, list). Contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.
class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(REQUIRED, list)
x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.
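To make the metric labels and the measurements field concrete, here is a minimal sketch of the metrics and measurements parts of a result; the feature names and values are hypothetical. The metric carries two labels, so it is computed on the intersection of the two subgroups, and the bar plot captures SHAP-style values:

metrics:
  - type: accuracy
    name: "accuracy w.r.t. class 0, restricted to gender:male and age:21"
    dtype: float
    value: 0.85
    labels:
      - name: gender
        type: feature
        dtype: string
        value: male
      - name: age
        type: feature
        dtype: int
        value: 21
measurements:
  bar_plots:
    - type: SHAP
      name: Mean Absolute SHAP Values
      results:
        - name: age
          value: 0.23
        - name: gender
          value: 0.11

assessment_card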
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the instrument in the instrument register.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the corresponding task in the instrument register.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.
name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.
version: {system_card_version}\nprovenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {system_name}\nupl: {upl_uri}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\ndescription: {system_description}\nlabels:\n - name: {label_name}\n value: {label_value}\nstatus: {system_status}\npublication_category: {system_publication_cat}\nbegin_date: {system_begin_date}\nend_date: {system_end_date}\ngoal_and_impact: {system_goal_and_impact}\nconsiderations: {system_considerations}\nrisk_management: {system_risk_management}\nhuman_intervention: {system_human_intervention}\nlegal_base:\n - name: {law_name}\n link: {law_uri}\nused_data: {system_used_data}\ntechnical_design: {technical_design}\nexternal_providers:\n - name: {name_external_provider}\n version: {version_external_provider}\nreferences:\n - {reference_uri}\ninteraction_details:\n - {system_interaction_details}\nversion_requirements:\n - {system_version_requirements}\ndeployment_variants:\n - {system_deployment_variants}\nhardware_requirements:\n - {system_hardware_requirements}\nproduct_markings:\n - {system_product_markings}\nuser_interface:\n - description: {system_user_interface}\n link: {system_user_interface_uri}\n snapshot: {system_user_interface_snapshot_uri}\n\nmodels:\n - !include {model_card_uri}\n\nassessments:\n - !include {assessment_card_uri}\n
"},{"location":"projects/tad/reporting-standard/#model-card","title":"Model Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nlanguage:\n - {lang_0}\nlicense:\n license_name: {license_name}\n license_link: {license_uri}\ntags:\n - {tag_0}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\n\nmodel-index:\n - name: {model_id}\n model: {model_uri}\n artifacts:\n - uri: {model_artifact_uri}\n - content-type: {model_artifact_type}\n - md5-checksum: {md5_checksum}\n parameters:\n - name: {parameter_name}\n dtype: {parameter_dtype}\n value: {parameter_value}\n labels:\n - name: {label_name}\n dtype: {label_type}\n value: {label_value}\n results:\n - task:\n - type: {task_type}\n name: {task_name}\n datasets:\n - type: {dataset_type}\n name: {dataset_name}\n split: {split}\n features:\n - {feature_name}\n revision: {dataset_version}\n metrics:\n - type: {metric_type}\n name: {metric_name}\n dtype: {metric_dtype}\n value: {metric_value}\n labels:\n - name: {label_name}\n type: {label_type}\n dtype: {label_type}\n value: {label_value}\n measurements:\n bar_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - name: {bar_name}\n value: {bar_value}\n graph_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - class: {class_name}\n feature: {feature_name}\n data:\n - x_value: {x_value}\n y_value: {y_value}\n
"},{"location":"projects/tad/reporting-standard/#assessment-card","title":"Assessment Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {assessment_name}\nurn: {urn}\ndate: {assessment_date}\ncontents:\n - question: {question_text}\n urn: {urn}\n answer: {answer_text}\n remarks: {remarks_text}\n authors:\n - name: {author_name}\n timestamp: {timestamp}\n
"},{"location":"projects/tad/reporting-standard/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset id's from Hugging Face datasets while we also allow for any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as, for example, the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a1/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics that do not fit in the metrics_field from the Hugging Face metadata specification; technical measurements, captured in measurements; and regulatory assessments, captured in assessments.
Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
's and assessment_card
's can be included directly into the system_card
or can be included as separate yaml files with the help of a yaml-include mechanism. For clarity the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a1\".name
(OPTIONAL, string). Name used to describe the system.upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card. This assessment card is an assessment card described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list[string]). A list of URIs, where each URI refers to a relevant model artifact that cannot be captured by any other field but is relevant to the model.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the feature. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. For example, metrics can be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list)x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
name
(REQUIRED, string). The name of the assessment.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date and time of the answer.version: {system_card_version} # Optional. Example: \"0.1a1\"\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"impactful_algorithm\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-1-1.\nend_date: {system_end_date} # Optional. Example: 2025-12-1.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Required. Example: iama.yaml.\n
"},{"location":"projects/tad/reporting-standard/0.1a1/#model-card","title":"Model Card","text":"language:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - {model_artifact} # Optional. URI to relevant model artifacts, if applicable.\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. 
So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a1/#assessment-card","title":"Assessment Card","text":"name: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 1711630721.\n
"},{"location":"projects/tad/reporting-standard/0.1a1/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts license id's from the Hugging Face license list while we accept any license from the Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset id's from Hugging Face datasets while we also allow for any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as, for example, the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a2/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics that do not fit in the metrics_field from the Hugging Face metadata specification; technical measurements, captured in measurements; and regulatory assessments, captured in assessments.
Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
's and assessment_card
's can be included directly into the system_card
or can be included as separate yaml files with the help of a yaml-include mechanism. For clarity the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".name
(OPTIONAL, string). Name used to describe the system.upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card, as described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). The type of the artifact, following the Content-Type format. A recognized value is \"application/onnx\", referring to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). MD5 checksum for the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list).x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
name
(REQUIRED, string). The name of the assessment.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date and time of the answer.version: {system_card_version} # Optional. Example: \"0.1a1\"\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Optional. Example: iama.yaml.\n
"},{"location":"projects/tad/reporting-standard/0.1a2/#model-card","title":"Model Card","text":"language:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. 
Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a2/#assessment-card","title":"Assessment Card","text":"name: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 1711630721.\n
"},{"location":"projects/tad/reporting-standard/0.1a2/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a2/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids from Hugging Face datasets, while we also allow any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a3/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
's and assessment_card
's can be included directly into the system_card
or can be included as separate yaml files with help of a yaml-include mechanism. For clarity the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".name
(OPTIONAL, string). Name used to describe the system.upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs pointing to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card, as described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). The type of the artifact, following the Content-Type format. A recognized value is \"application/onnx\", referring to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). MD5 checksum for the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list).x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
name
(REQUIRED, string). The name of the assessment.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.version: {system_card_version} # Optional. Example: \"0.1a1\"\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Optional. Example: iama.yaml.\n
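The UTC (Z) ISO 8601 timestamps preferred in this version can be generated, for example, as follows (a minimal Python sketch):

```python
# Sketch: producing the preferred UTC (Z) ISO 8601 timestamp format.
from datetime import datetime, timezone

timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
print(timestamp)  # e.g. 2024-04-16T16:48:14Z
```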
"},{"location":"projects/tad/reporting-standard/0.1a3/#model-card","title":"Model Card","text":"language:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. 
Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a3/#assessment-card","title":"Assessment Card","text":"name: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n
"},{"location":"projects/tad/reporting-standard/0.1a3/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a3/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids from Hugging Face datasets, while we also allow any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a4/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
's and assessment_card
's can be included directly into the system_card
or can be included as separate yaml files with help of a yaml-include mechanism. For clarity the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs pointing to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card, as described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). The type of the artifact, following the Content-Type format. A recognized value is \"application/onnx\", referring to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). MD5 checksum for the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list).x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.name
(REQUIRED, string). The name of the assessment.
date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.version: {system_card_version} # Optional. Example: \"0.1a1\"\nprovenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Optional. Example: iama.yaml.\n
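A generating tool might fill the provenance block introduced in this version along these lines (a minimal sketch assuming generation happens inside a git checkout; the uri and author values are copied from the example above and are illustrative):

```python
# Sketch: filling the provenance block from the current git checkout.
# The uri and author values are illustrative.
import subprocess
from datetime import datetime, timezone

provenance = {
    "git_commit_hash": subprocess.check_output(
        ["git", "rev-parse", "HEAD"], text=True
    ).strip(),
    "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "uri": "https://github.com/MinBZK/tad-conversion-tool",
    "author": "John Doe",
}
```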
"},{"location":"projects/tad/reporting-standard/0.1a4/#model-card","title":"Model Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nlanguage:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. 
Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a4/#assessment-card","title":"Assessment Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n
"},{"location":"projects/tad/reporting-standard/0.1a4/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a4/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset IDs from Hugging Face datasets, while we also allow any URL pointing to the dataset.
For this extension to work, relevant metrics (such as false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a5/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
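As a sketch (the model card values here are hypothetical), the two options look as follows inside a system_card:

models:
# Option 1: a model card included verbatim as a nested mapping.
- license: Apache-2.0
  tags:
  - example
# Option 2: an equivalent model card referenced from a separate file.
- !include cat_classifier_model.yaml

Both forms are equivalent; the !include variant keeps the system_card file itself compact.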
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, these fields allow storing information on external providers. There can be multiple external providers.name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs pointing to relevant information about the system.interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Provide information on the user interface provided to the user responsible for its operation.description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card. This assessment card is an assessment card described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). The type of the artifact, following the Content-Type convention. Recognized values are \"application/onnx\", to refer to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Checksum for the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the feature. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, which can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present. A filled-in sketch is given after the model card example in the next section.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). The results field contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list)x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.version: {system_card_version} # Optional. Example: \"0.1a1\"\nprovenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"impactful_algorithm\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- name: {name_external_provider} # Optional. Reference to used external providers.\n version: {version_external_provider} # Optional. Version of the external provider.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\ninteraction_details:\n- {system_interaction_details} # Optional. Example: \"GPS modules for location tracking\"\nversion_requirements:\n- {system_version_requirements} # Optional. Example: \">version2.1\"\ndeployment_variants:\n- {system_deployment_variants} # Optional. Example: \"Web Application\"\nhardware_requirements:\n- {system_hardware_requirements} # Optional. Example: \"8 cores, 16 threads CPU\"\nproduct_markings:\n- {system_product_markings} # Optional. Example: \"Model number in the info menu\"\nuser_interface:\n- description: {system_user_interface} # Optional. Example: \"web-based dashboard\"\n link: {system_user_interface_uri} # Optional. Example: \"http://example.com/content\"\n snapshot: {system_user_interface_snapshot_uri} # Optional. Example: \"http://example.com/snapshot.png\"\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Required. Example: iama.yaml.\n
"},{"location":"projects/tad/reporting-standard/0.1a5/#model-card","title":"Model Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nlanguage:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. 
Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a5/#assessment-card","title":"Assessment Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n
"},{"location":"projects/tad/reporting-standard/0.1a5/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a5/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset IDs from Hugging Face datasets, while we also allow any URL pointing to the dataset.
For this extension to work, relevant metrics (such as false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a6/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.
name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.
technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, these fields allow storing information on external providers. There can be multiple external providers.
name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs pointing to relevant information about the system.
interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Provide information on the user interface provided to the user responsible for its operation.
description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card. This assessment card is an assessment card described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.
model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED).
license_name
(REQUIRED, string). Any license from the open source license list1. If the license is NOT present in the license list, this field must be set to 'other' and the license_link field will be REQUIRED.license_link
(OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.model_index
(REQUIRED, list). There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). The type of the artifact, following the Content-Type convention. Recognized values are \"application/onnx\", to refer to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Checksum for the content of the file.parameters
(OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(OPTIONAL, list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the feature. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED.results
(OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(OPTIONAL, list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, which can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example \"5503434ddd753f426f4b38109466949a1217c2bb\".metrics
(OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(OPTIONAL, list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(OPTIONAL, list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.
name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(OPTIONAL, list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(REQUIRED, list). The results field contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present. A filled-in sketch of the measurements section is given after the model card example in the next section.
class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(REQUIRED, list)
x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.
name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.
version: {system_card_version}\nprovenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {system_name}\nupl: {upl_uri}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\ndescription: {system_description}\nlabels:\n - name: {label_name}\n value: {label_value}\nstatus: {system_status}\npublication_category: {system_publication_cat}\nbegin_date: {system_begin_date}\nend_date: {system_end_date}\ngoal_and_impact: {system_goal_and_impact}\nconsiderations: {system_considerations}\nrisk_management: {system_risk_management}\nhuman_intervention: {system_human_intervention}\nlegal_base:\n - name: {law_name}\n link: {law_uri}\nused_data: {system_used_data}\ntechnical_design: {technical_design}\nexternal_providers:\n - name: {name_external_provider}\n version: {version_external_provider}\nreferences:\n - {reference_uri}\ninteraction_details:\n - {system_interaction_details}\nversion_requirements:\n - {system_version_requirements}\ndeployment_variants:\n - {system_deployment_variants}\nhardware_requirements:\n - {system_hardware_requirements}\nproduct_markings:\n - {system_product_markings}\nuser_interface:\n - description: {system_user_interface}\n link: {system_user_interface_uri}\n snapshot: {system_user_interface_snapshot_uri}\n\nmodels:\n - !include {model_card_uri}\n\nassessments:\n - !include {assessment_card_uri}\n
"},{"location":"projects/tad/reporting-standard/0.1a6/#model-card","title":"Model Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nlanguage:\n - {lang_0}\nlicense:\n license_name: {license_name}\n license_link: {license_uri}\ntags:\n - {tag_0}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\n\nmodel-index:\n - name: {model_id}\n model: {model_uri}\n artifacts:\n - uri: {model_artifact_uri}\n - content-type: {model_artifact_type}\n - md5-checksum: {md5_checksum}\n parameters:\n - name: {parameter_name}\n dtype: {parameter_dtype}\n value: {parameter_value}\n labels:\n - name: {label_name}\n dtype: {label_type}\n value: {label_value}\n results:\n - task:\n - type: {task_type}\n name: {task_name}\n datasets:\n - type: {dataset_type}\n name: {dataset_name}\n split: {split}\n features:\n - {feature_name}\n revision: {dataset_version}\n metrics:\n - type: {metric_type}\n name: {metric_name}\n dtype: {metric_dtype}\n value: {metric_value}\n labels:\n - name: {label_name}\n type: {label_type}\n dtype: {label_type}\n value: {label_value}\n measurements:\n bar_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - name: {bar_name}\n value: {bar_value}\n graph_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - class: {class_name}\n feature: {feature_name}\n data:\n - x_value: {x_value}\n y_value: {y_value}\n
"},{"location":"projects/tad/reporting-standard/0.1a6/#assessment-card","title":"Assessment Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {assessment_name}\ndate: {assessment_date}\ncontents:\n - question: {question_text}\n answer: {answer_text}\n remarks: {remarks_text}\n authors:\n - name: {author_name}\n timestamp: {timestamp}\n
"},{"location":"projects/tad/reporting-standard/0.1a6/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a6/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset IDs from Hugging Face datasets, while we also allow any URL pointing to the dataset.
For this extension to work, relevant metrics (such as false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/latest/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
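As a sketch of how this can look on disk (the file and system names are hypothetical; cat_classifier_model.yaml and iama.yaml follow the examples used throughout this standard), a compact system_card references its model and assessment cards as separate files:

# system_card.yaml
name: ExampleSystem
models:
- !include cat_classifier_model.yaml
assessments:
- !include iama.yaml

Note that !include is not part of the YAML specification itself; tooling that reads these files needs a YAML loader extended with an include constructor that resolves the referenced paths.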
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.
name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.
technical_design
(OPTIONAL, string). Description on how the system works.external_providers
(OPTIONAL, list). If relevant, these fields allow storing information on external providers. There can be multiple external providers.
name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.
interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Provide information on the user interface provided to the user responsible for its operation.
description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card. This assessment card is an assessment card as described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.
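To make the field descriptions above concrete, a minimal system_card could look as follows (a sketch; all values are illustrative, and only schema_version is REQUIRED):
schema_version: 0.1a2\nname: Digital Assistant\ndescription: A chatbot answering questions from citizens.\nstatus: production\npublication_category: other\nbegin_date: 2024-01-01\nmodels:\n - !include model_card.yaml\nassessments:\n - !include assessment_card.yaml\n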
model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED).
license_name
(REQUIRED, string). Any license from the open source license list1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.license_link
(OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.model_index
(REQUIRED, list). There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts. For each artifact the following fields are present.
uri
(OPTIONAL, string). URI that refers to a relevant model artifact.content-type
(OPTIONAL, string). Optional type, following the Content-Type convention. A recognized value is \"application/onnx\", which refers to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Optional MD5 checksum for the content of the file.parameters
(OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(OPTIONAL, list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(OPTIONAL, list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example \"5503434ddd753f426f4b38109466949a1217c2bb\".metrics
(OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(OPTIONAL, list). This field allows storing meta information about a metric. For example, metrics can be computed on subgroups of specific features, such as the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(OPTIONAL, list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.
name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(OPTIONAL, list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(REQUIRED, list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.
class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(REQUIRED, list)
x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the instrument in the instrument register.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the corresponding task in the instrument register.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.
name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.
schema_version: {system_card_version}\nprovenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {system_name}\nupl: {upl_uri}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\ndescription: {system_description}\nlabels:\n - name: {label_name}\n value: {label_value}\nstatus: {system_status}\npublication_category: {system_publication_cat}\nbegin_date: {system_begin_date}\nend_date: {system_end_date}\ngoal_and_impact: {system_goal_and_impact}\nconsiderations: {system_considerations}\nrisk_management: {system_risk_management}\nhuman_intervention: {system_human_intervention}\nlegal_base:\n - name: {law_name}\n link: {law_uri}\nused_data: {system_used_data}\ntechnical_design: {technical_design}\nexternal_providers:\n - name: {name_external_provider}\n version: {version_external_provider}\nreferences:\n - {reference_uri}\ninteraction_details:\n - {system_interaction_details}\nversion_requirements:\n - {system_version_requirements}\ndeployment_variants:\n - {system_deployment_variants}\nhardware_requirements:\n - {system_hardware_requirements}\nproduct_markings:\n - {system_product_markings}\nuser_interface:\n - description: {system_user_interface}\n link: {system_user_interface_uri}\n snapshot: {system_user_interface_snapshot_uri}\n\nmodels:\n - !include {model_card_uri}\n\nassessments:\n - !include {assessment_card_uri}\n
"},{"location":"projects/tad/reporting-standard/latest/#model-card","title":"Model Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nlanguage:\n - {lang_0}\nlicense:\n license_name: {license_name}\n license_link: {license_uri}\ntags:\n - {tag_0}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\n\nmodel-index:\n - name: {model_id}\n model: {model_uri}\n artifacts:\n - uri: {model_artifact_uri}\n - content-type: {model_artifact_type}\n - md5-checksum: {md5_checksum}\n parameters:\n - name: {parameter_name}\n dtype: {parameter_dtype}\n value: {parameter_value}\n labels:\n - name: {label_name}\n dtype: {label_type}\n value: {label_value}\n results:\n - task:\n - type: {task_type}\n name: {task_name}\n datasets:\n - type: {dataset_type}\n name: {dataset_name}\n split: {split}\n features:\n - {feature_name}\n revision: {dataset_version}\n metrics:\n - type: {metric_type}\n name: {metric_name}\n dtype: {metric_dtype}\n value: {metric_value}\n labels:\n - name: {label_name}\n type: {label_type}\n dtype: {label_type}\n value: {label_value}\n measurements:\n bar_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - name: {bar_name}\n value: {bar_value}\n graph_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - class: {class_name}\n feature: {feature_name}\n data:\n - x_value: {x_value}\n y_value: {y_value}\n
"},{"location":"projects/tad/reporting-standard/latest/#assessment-card","title":"Assessment Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {assessment_name}\nurn: {urn}\ndate: {assessment_date}\ncontents:\n - question: {question_text}\n urn: {urn}\n answer: {answer_text}\n remarks: {remarks_text}\n authors:\n - name: {author_name}\n timestamp: {timestamp}\n
"},{"location":"projects/tad/reporting-standard/latest/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/latest/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids from Hugging Face datasets, while we also allow any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
The purpose of a code review is to ensure the quality and readability of a change, and that all requirements from the ticket have been met, before it gets merged into the main codebase. Additionally, code reviews are a communication tool: they allow team members to stay aware of changes being made.
Code reviews involve having a team member examine the changes made by another team member and give feedback or ask questions if needed.
"},{"location":"way-of-working/code-reviews/#creating-a-pull-request","title":"Creating a Pull Request","text":"We use GitHub pull requests (PR) for code reviews. You can make a draft PR if your work is still in progress. When you are done you can remove the draft status. A team member may start reviewing when the PR does not have a draft status.
For team ADRs at least 3 approving reviews are required; if the ADR can be expected to be controversial, all team members should approve.
A team ADR is an ADR made in the ai-validation repository.
All other PRs need at least 1 approving review, but can have more reviewers if desired (by either reviewer or author).
"},{"location":"way-of-working/code-reviews/#review-process","title":"Review process","text":"By default the codeowner, indicated in the CODEOWNER file, will be requested to review. For us this is the GitHub team AI-validation. If the PR creator wants a specific team member to review, the PR creator should add the team member specifically in the reviewers section of the PR. A message in Mattermost will be posted for PRs. Then with the reaction of an emoji a reviewer will indicate they are looking at the PR.
If the reviewer has suggestions or comments, the PR creator can fix those or respond to the suggestions. When the PR creator thinks they have addressed the feedback, they must re-request a review from the person that did the review. The reviewer must then look at the changes and approve or add more comments. This process continues until the reviewer agrees that all is correct and approves the PR.
Once the review is approved, the reviewer checks whether the branch is in sync with the main branch before merging. If not, the reviewer rebases the branch. Once the branch is in sync with main, the reviewer merges the PR and checks whether the deployment is successful. If the deployment is not successful, the reviewer fixes it. If the PR needs more than one review, the last approving reviewer merges the PR.
"},{"location":"way-of-working/contributing/","title":"Contributing to AI Validation","text":"First off, thanks for taking the time to contribute! \u2764\ufe0f
All types of contributions are encouraged and valued. See the Table of Contents for different ways to help and details about how this project handles them. Please make sure to read the relevant section before making your contribution. It will make it a lot easier for us maintainers and smooth out the experience for all involved. The community looks forward to your contributions. \ud83c\udf89
"},{"location":"way-of-working/contributing/#table-of-contents","title":"Table of Contents","text":"This project and everyone participating in it is governed by the Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to ai-validatie@minbzk.nl.
"},{"location":"way-of-working/contributing/#i-have-a-question","title":"I Have a Question","text":"Before you ask a question, it is best to search for existing Issues that might help you. In case you have found a suitable issue and still need clarification, you can write your question in this issue.
If you then still feel the need to ask a question and need clarification, we recommend the following:
We will then take care of the issue as soon as possible.
"},{"location":"way-of-working/contributing/#i-want-to-contribute","title":"I Want To Contribute","text":""},{"location":"way-of-working/contributing/#legal-notice","title":"Legal Notice","text":"When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content and that the content you contribute may be provided under the project license.
"},{"location":"way-of-working/contributing/#reporting-bugs","title":"Reporting Bugs","text":""},{"location":"way-of-working/contributing/#before-submitting-a-bug-report","title":"Before Submitting a Bug Report","text":"A good bug report shouldn't leave others needing to chase you up for more information. Therefore, we ask you to investigate carefully, collect information and describe the issue in detail in your report. Please complete the following steps in advance to help us fix any potential bug as fast as possible.
You must never report security related issues, vulnerabilities or bugs including sensitive information to the issue tracker, or elsewhere in public. Instead sensitive bugs must be sent by email to ai-validatie@minbzk.nl.
We use GitHub issues to track bugs and errors. If you run into an issue with the project:
Once it's filed:
needs-repro
. Bugs with the needs-repro
tag will not be addressed until they are reproduced.needs-fix
, as well as possibly other tags (such as critical
), and the issue will be left to be implemented by someone.This section guides you through submitting an enhancement suggestion for this project, including completely new features and minor improvements. Following these guidelines will help maintainers and the community to understand your suggestion and find related suggestions.
"},{"location":"way-of-working/contributing/#before-submitting-an-enhancement","title":"Before Submitting an Enhancement","text":"Enhancement suggestions are tracked as GitHub issues.
We have commit message conventions: Commit convention
"},{"location":"way-of-working/contributing/#markdown-lint","title":"Markdown Lint","text":"We use Markdown lint to standardize Markdown: Markdown lint config.
"},{"location":"way-of-working/contributing/#pre-commit","title":"Pre-commit","text":"We use pre-commit to enabled standardization: pre-commit config.
"},{"location":"way-of-working/decision-log/","title":"Decision Log","text":"Throughout our work, small decisions about processes and approaches are often made in meetings and chats. While these aren't big enough for formal documentation like ADRs, capturing them is valuable for both current and future team members.
This log provides a reference point for those decisions.
"},{"location":"way-of-working/decision-log/#overview-of-decisions","title":"Overview of decisions","text":"We're sad to see you go! But if you do, here's what not to forget.
"},{"location":"way-of-working/off-boarding/#github","title":"GitHub","text":"For clarity and consistency, this document defines some terms used within our team where the meaning in Data Science or Computer Science differs, and terms that are for any reason good to mention.
For a full reference for Machine Learning, we recommend ML Fundamentals from Google.
"},{"location":"way-of-working/onboarding/","title":"Onboarding","text":"Make sure you have installed Mattermost, then follow these steps.
Make sure you have installed Webex, then follow these steps.
Make sure you have installed Tuple, then follow these steps.
Create or use your existing GitHub account.
Bookmark these links in your browser:
We use HashiCorp Vault secrets manager for team secrets. You can login with a GitHub Personal access token. The token needs organization read permissions (read:org
), and you should be part of our GitHub team to access the vault.
We are assuming your dev machine is a Mac. This guide is rather opinionated, feel free to have your own opinion, and feel free to contribute! Contributing can be done by clicking \"edit\" top right and by making a pull request on this repository.
"},{"location":"way-of-working/onboarding/dev-machine/#things-that-should-have-been-default-on-mac","title":"Things that should have been default on Mac","text":"Homebrew as the missing Package Manager
/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"\n
Rectangle
brew install --cask rectangle\n
WebEx for video conferencing
brew install --cask webex\n
Mattermost for team communication
brew install --cask mattermost\n
Iterm2
brew install --cask iterm2\n
Oh My Zsh
/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)\"\n
Autosuggestions for zsh
git clone https://github.com/zsh-users/zsh-autosuggestions ~/.oh-my-zsh/custom/plugins/zsh-autosuggestions\n
Fish shell like syntax highlighting for Zsh
brew install zsh-syntax-highlighting\n
Add plugins to your shell in ~/.zshrc
plugins = (\n # other plugins...\n zsh-autosuggestions\n kubectl\n docker\n docker-compose\n pyenv\n z\n)\n
Touch ID in Terminal
Sourcetree
brew install --cask sourcetree\n
Pyenv
brew install pyenv\n
pyenv virtualenv
brew install pyenv-virtualenv\n
pre-commit
brew install pre-commit\n
Xcode Command Line Tools
xcode-select --install\n
TabbyML Opensource, self-hosted AI coding assistant
We can not just use hosted versions of coding assistants because of privacy and copyright issues. We can however use self-hosted coding assistants provided they are trained on data with permissive licenses.
StarCoder (1-7B) models are all trained on version 1.2 of The Stack dataset. It boils down to all open GitHub code with permissive licenses (193 licenses in total). Minus opt-out requests.
Code Lama and Deepseek models are not clear enough about their data licenses.
brew install tabbyml/tabby/tabby\ntabby serve --device metal --model TabbyML/StarCoder-3B\n
Then configure your IDE by installing a plugin.
Sign commits using SSH
Robbert is a highly enthusiastic full-stack engineer with a Bachelor's degree in Computer Science from the Hanze University of Applied Sciences in Groningen. He is passionate about building secure, compliant, and ethical solutions, and thrives in collaborative environments. Robbert is eager to leverage his skills and knowledge to help shape and propel the future of IT within the government.
uittenbroekrobbert
Robbert Uittenbroek
"},{"location":"about/team/#laurens-weijs","title":"Laurens Weijs","text":"Engineer
Laurens is a passionate guy with a love for innovation and doing things differently. With a background in Econometrics and Computer Science he loves to tackle the IT challenges of the Government by helping other people through extensive knowledge sharing on stage, building neural networks himself, or building a strong community.
laurensWe
Laurens Weijs
"},{"location":"about/team/#guusje-juijn","title":"Guusje Juijn","text":"Trainee
Guusje is currently enrolled in a two-year traineeship at the Dutch Government. After finishing her first assignment at a policy department, she is excited to bring her knowledge about AI policy to a technical team. Guusje has a background in Artificial Intelligence, is experienced in Python and machine learning and has a strong interest in AI ethics.
GuusjeJuijn
Guusje Juijn
"},{"location":"about/team/#ruben-rouwhof","title":"Ruben Rouwhof","text":"UX/UI Designer
Ruben is a dedicated UX/UI Designer focused on crafting user-centric digital experiences. He is involved in projects from start to finish, covering user research, design, and technical implementation.
rubenrouwhof
Ruben Rouwhof
rubenrouwhof.nl
"},{"location":"about/team/#ravi-meijer","title":"Ravi Meijer","text":"Product Researcher
Ravi is an accomplished data scientist with expertise in machine learning, responsible AI, and the data science lifecycle. Her background in AI fuels her passion for solving complex problems and driving innovation for positive social impact.
ravimeijerrig
Ravi Meijer
"},{"location":"about/team/#our-alumni","title":"Our Alumni","text":""},{"location":"about/team/#willy-tadema","title":"Willy Tadema","text":"AI Ethics Lead
Willy specializes in AI governance, AI risk management, AI assurance and ethics-by-design. She is an advocate of AI standards and a member of several ethics committees.
FrieseWoudloper
Willy Tadema
"},{"location":"adrs/0001-adrs/","title":"ADR-0001 ADRs","text":""},{"location":"adrs/0001-adrs/#context","title":"Context","text":"In modern software development practices, the use of Architecture Decision Records (ADRs) has become increasingly common. ADRs are documents that capture important architectural decisions made during the development process. These decisions play a crucial role in guiding the development team and ensuring consistency and coherence in the architecture of the software system.
"},{"location":"adrs/0001-adrs/#assumptions","title":"Assumptions","text":"We will utilize ADRs in our team to document and communicate architectural decisions effectively. Furthermore, we will publish these ADRs publicly to promote transparency and facilitate collaboration.
"},{"location":"adrs/0001-adrs/#template","title":"Template","text":"Use the template below to add an ADR:
# ADR-XXXX Title\n\n## Context\n\nWhat is the issue that we're seeing that is motivating this decision or change?\n\n## Assumptions\n\nAnything that could cause problems if untrue now or later. (optional)\n\n## Decision\n\nWhat is the change that we're proposing and/or doing?\n\n## Risks\n\nAnything that could cause malfunction, delay, or other negative impacts. (optional)\n\n## Consequences\n\nWhat becomes easier or more difficult to do because of this change?\n\n## More Information\n\nProvide additional evidence/confidence for the decision outcome\nLinks to other decisions and resources might here appear as well. (optional)\n
"},{"location":"adrs/0002-code-platform/","title":"ADR-0002 Code Platform","text":""},{"location":"adrs/0002-code-platform/#context","title":"Context","text":"In the landscape of software development, the choice of coding platform significantly impacts developer productivity, collaboration, and code quality. it's crucial to evaluate and select a coding platform that aligns with our development needs and fosters efficient workflows.
"},{"location":"adrs/0002-code-platform/#assumptions","title":"Assumptions","text":"The following assumptions are made:
After careful consideration and evaluation of various options like GitHub, GitLab and BitBucket, we propose adopting GitHub as our primary coding platform. The decision is based on the following factors:
Costs: There are currently no costs associate in using GitHub for our use cases.
Features and Functionality: GitHub offers a comprehensive set of features essential for modern software development and collaboration with external teams, including version control, code review, issue tracking, continuous integration, and deployment automation.
Security: GitHub offers a complete set of security features essential to secure development like dependency management and security scanning.
Community and Ecosystem: GitHub boasts a vibrant community and ecosystem, facilitating knowledge sharing, collaboration, and access to third-party tools, and services that can enhance our development workflows. Within our organization we have easy access to the team managing the GitHub organization.
Usability and User Experience: A user-friendly interface and intuitive workflows are essential for maximizing developer productivity and minimizing onboarding time. GitHub offers a streamlined user experience and customizable workflows that align with our team's preferences and practices.
"},{"location":"adrs/0002-code-platform/#risks","title":"Risks","text":"Currently the organization of MinBZK on GitHub does not have a lot of people
indicating that our team is an early adapter of the platform within the organization. This might impact our features due to cost constrains.
If we choose another tool in the future we need to migrate our codebase, and potentially need to rewrite some specific GitHub features that cannot be used in another tool.
"},{"location":"adrs/0002-code-platform/#more-information","title":"More Information","text":"Alternatives considered:
Our development team wants to implement a CI/CD solution to streamline the build, testing, and deployment workflows of our software products. Currently, our codebase resides on GitHub, and we leverage Kubernetes as our chosen orchestration platform, managed by the DigiLab platform team.
"},{"location":"adrs/0003-ci-cd/#decision","title":"Decision","text":"We will use the following tools for CI/CD pipeline:
GitHub Actions aligns with our existing infrastructure, ensuring seamless integration with our codebase and minimizing operational overhead. GitHub Actions' specific syntax for CI results in vendor lock-in, necessitating significant effort to migrate to an alternative CI system in the future.
Flux, being a GitOps operator for Kubernetes, offers a declarative approach to managing deployments, enhancing reliability and repeatability within our Kubernetes ecosystem.
"},{"location":"adrs/0004-software-hosting-platform/","title":"ADR-0004 Software hosting platform","text":""},{"location":"adrs/0004-software-hosting-platform/#context","title":"Context","text":"Our team recognizes the necessity of a platform to run our software, as our local machines lack the capacity to handle certain workloads effectively. We have evaluated several options available to us:
We operate under the following assumptions:
We will use Digilab Kubernetes for our workloads.
"},{"location":"adrs/0004-software-hosting-platform/#consequences","title":"Consequences","text":"By choosing Digilab Kubernetes, we gain access to a namespace within their managed Kubernetes cluster. However, it's important to note that Digilab does not provide any guarantees regarding the availability of the cluster. Should our software require higher availability assurances, we may need to explore alternative solutions.
"},{"location":"adrs/0005-python-tooling/","title":"ADR-0005 Python coding standard and tools","text":""},{"location":"adrs/0005-python-tooling/#context","title":"Context","text":"In modern software development, maintaining code quality is crucial for readability, maintainability, and collaboration. Python, being a dynamically typed language, requires robust tooling to ensure code consistency and type safety. Manual enforcement of coding standards is time-consuming and error-prone. Hence, adopting automated tooling to streamline this process is imperative.
"},{"location":"adrs/0005-python-tooling/#decision","title":"Decision","text":"We will use these standards and tools for our own projects:
Working with external projects these coding standards will not always be possible. but we will try to integrate them as much as possible.
"},{"location":"adrs/0005-python-tooling/#consequences","title":"Consequences","text":"Improved Code Quality: Adoption of these tools will lead to improved code quality, consistency, and maintainability across the project.
Enhanced Developer Productivity: Automated code formatting and static type checking will reduce manual effort and free developers to focus more on coding logic rather than formatting and type-related issues.
Reduced Bug Incidence: Static typing and linting will catch potential bugs and issues early in the development process, reducing the likelihood of runtime errors and debugging efforts.
Standardized Development Workflow: By integrating pre-commit hooks, the development workflow will be standardized, ensuring that all developers follow the same code quality standards.
"},{"location":"adrs/0006-agile-tooling/","title":"ADR-0006 Agile tooling","text":""},{"location":"adrs/0006-agile-tooling/#context","title":"Context","text":"Our development team wants to enhance transparency and productivity in our software development processes. We are using GitHub for version control and collaboration. However, to further streamline our process, there is a need to incorporate tooling for managing the effort of our team.
"},{"location":"adrs/0006-agile-tooling/#decision","title":"Decision","text":"We will use GitHub Projects as our agile process tool
"},{"location":"adrs/0006-agile-tooling/#consequences","title":"Consequences","text":"GitHub Projects seamlessly integrates with our existing GitHub repositories, allowing us to manage our Agile processes. within the same ecosystem where our code resides. This integration eliminates the need for additional third-party tools, simplifying our workflow.
"},{"location":"adrs/0007-commit-convention/","title":"ADR-0007 Commit convention","text":""},{"location":"adrs/0007-commit-convention/#context","title":"Context","text":"In software development, maintaining clear and consistent commit message conventions is crucial for effective collaboration, code review, and project management. Commit messages serve as a form of documentation, helping developers understand the changes introduced by each commit without having to analyze the code diff extensively.
"},{"location":"adrs/0007-commit-convention/#decision","title":"Decision","text":"A commit message must follow the following rules:
\\<ref>-\\<ticketnumber>: subject line
An example of a commit message:
Fix foo to enable bar
or
AB-1234: Fix foo to enable bar
or
Fix foo to enable bar
This fixes the broken behavior of component abc caused by problem xyz.
If we contribute to projects not started by us we try to follow the above standard unless a specific convention is obvious or required by the project.
"},{"location":"adrs/0007-commit-convention/#consequences","title":"Consequences","text":"In some repositories Conventional Commits are used. This ADR does not follow conventional commits.
"},{"location":"adrs/0008-architectural-diagram-tooling/","title":"ADR-0008 Architectural Diagram Tooling","text":""},{"location":"adrs/0008-architectural-diagram-tooling/#context","title":"Context","text":"To communicate our designs in a graphical manner, it is of importance to draw architectural diagrams. For this we use tooling, that supports us in our work. We need to have something that is written so that it can be processed by both people and machine, and we want to have version control on our diagrams.
"},{"location":"adrs/0008-architectural-diagram-tooling/#decision","title":"Decision","text":"We will write our architectural diagrams in Markdown-like (.mmmd) in the Mermaid Syntax to edit these diagrams one can use the various plugins. For each project where it is needed, we will add the diagrams in the repository of the subject. The level of detail we will provide in the diagrams is according to the C4-model metamodel on architecture diagramming.
"},{"location":"adrs/0008-architectural-diagram-tooling/#consequences","title":"Consequences","text":"Standardized Workflow: By maintaining architecture as code, it will be standardized in our workflow.
Version control on diagrams: By using version control, we will be able to collaborate easier on the diagrams, and we will be able to see the history of them.
Diagrams are in .md format: By storing our diagrams next to our code, it will be where you need it the most.
"},{"location":"adrs/0010-container-registry/","title":"ADR-0010 Container Registry","text":""},{"location":"adrs/0010-container-registry/#context","title":"Context","text":"Containers allow us to package and run applications in a standardized and portable way. To be able to (re)use and share images, they need to be stored in a registry that is accessible by others.
There are many container registries. During research the following registries have been noted:
Docker Hub, GitHub Container Registry, Amazon Elastic Container Registry (ECR), Azure Container Registry (ACR), Google Artifact Registry (GAR), Red Hat Quay, GitLab Container Registry, Harbor, Sonatype Nexus Repository Manager, JFrog Artifactory.
"},{"location":"adrs/0010-container-registry/#assumptions","title":"Assumptions","text":"We will use GitHub Container Registry.
This aligns best with the previously made choices for GitHub as a code repository and CI/CD workflow.
"},{"location":"adrs/0010-container-registry/#risks","title":"Risks","text":"Traditionally, Docker Hub has been the place to publish images. Therefore, our images may be more difficult to discover.
The following assumptions are not (directly) covered by the chosen registry:
By using GitHub Container Registry we have a container registry we can use both internally as well as share with others. This has low impact, we can always move to another registry since the Open Container Initiative is standardized.
"},{"location":"adrs/0010-container-registry/#more-information","title":"More Information","text":"The following sites have been consulted:
The AI validation team works transparently. Working with public funds warrants transparency toward the public. Additionally, being transparent aligns with the team's mission of increasing the transparency of public organizations. In line with this reasoning, it is important to be open to researchers interested in the work of the AI validation team. Allowing researchers to conduct research within the team contributes to transparency and enables external perspectives and feedback to be incorporated into the team's work.
"},{"location":"adrs/0011-researcher-in-residence/#assumptions","title":"Assumptions","text":"We have decided to include a researcher in residence as a member of our team.
The researcher in residence takes the following form:
The following conditions apply to the researcher in residence.
Risks around a potential chilling effect (team members not feeling free to express themselves) are mitigated by the conditions we impose. In light of aforementioned form and conditions above, we see no further significant risks.
"},{"location":"adrs/0011-researcher-in-residence/#consequences","title":"Consequences","text":"Including a researcher in residence makes it easier for them to conduct research within both the team and the wider organization where the AI validation team operates. This benefits the quality of the research findings and the feedback provided to the team and organization.
"},{"location":"adrs/0012-dictionary-for-spelling/","title":"ADR-0012 Dictionary for spelling","text":""},{"location":"adrs/0012-dictionary-for-spelling/#context","title":"Context","text":"We use English as language in some of our external communications, like on GitHub. We noticed that among different documents certain words are spelled correctly but differently, depending on the author or dictionary used. Also there are occasional typos which can cause distraction and don't meet professional standards.
"},{"location":"adrs/0012-dictionary-for-spelling/#assumptions","title":"Assumptions","text":"Standardizing the used dictionary avoids discussion on spelling and makes documents consistent. Eliminating typos contributes to professional, credible and unambiguous documents.
Using a dictionary in a pre-commit hook will prevent commits being made with obvious spelling issues.
"},{"location":"adrs/0012-dictionary-for-spelling/#decision","title":"Decision","text":"We will use the U.S. English spelling dictionary.
"},{"location":"adrs/0012-dictionary-for-spelling/#risks","title":"Risks","text":"It may slow down committing large files.
"},{"location":"adrs/0012-dictionary-for-spelling/#consequences","title":"Consequences","text":"Documents will all use the same dictionary for spelling and will not contain typos.
"},{"location":"adrs/0013-date-time-representation/","title":"ADR-0013 Date Time Representation: ISO 8601","text":""},{"location":"adrs/0013-date-time-representation/#context","title":"Context","text":"In our software development projects, we have encountered ambiguity related to the representation of dates and times, particularly when dealing with time zones. The lack of a standardized approach has led to discussions and possibly ambiguity when interpreting timestamps within our applications.
"},{"location":"adrs/0013-date-time-representation/#assumptions","title":"Assumptions","text":"Standardizing the representation of dates and times will improve clarity and precision in our application's logic and user interfaces.
ISO 8601 format is better human-readable than other formats such as unix timestamps.
"},{"location":"adrs/0013-date-time-representation/#decision","title":"Decision","text":"We adopt ISO 8601 with timezone notation, preferably in UTC (Z
), as the standard method for representing dates and times in our software projects, replacing the usage of Unix timestamps or any other formats or timezones. We use both dashes (-
) and colons (:
).
We store date and time as: 2024-04-16T16:48:14Z
(preferably with Z
as timezone, representing UTC)
We store dates as 2024-04-16
.
Only when capturing client events we may want to choose to store the client timezone instead of UTC.
When rendering a date and time in a user interface, we may want to localize the date and time for the appropriate timezone.
"},{"location":"adrs/0013-date-time-representation/#risks","title":"Risks","text":"Increased storage space: ISO 8601 representations can be longer than other formats, leading to potential increases in storage requirements, especially when dealing with large datasets.
"},{"location":"adrs/0013-date-time-representation/#consequences","title":"Consequences","text":"A single ISO 8601 with UTC timezone provides a clear and unambiguous way to represent dates and times. Its format is easily recognizable and eliminates the need for interpretation. For example: 2024-04-15T10:00:00Z
can easily be understood without needing to parse it using a library.
We will need to regularly convert from localized time to UTC and back when capturing, storing, and rendering dates and times.
"},{"location":"adrs/0013-date-time-representation/#more-information","title":"More Information","text":"ISO 8601 is an internationally recognized standard endorsed by the International Organization for Standardization (ISO). Its adoption offers numerous benefits, including improved clarity, global accessibility, and future-proofing of systems and applications.
For further reading on ISO 8601:
In order to expand our reach and foster international collaboration in the field of AI Validation, we have decided to conduct all communication in English on public platforms such as GitHub. This decision aims to facilitate better understanding and participation from our global colleagues. However, within the Government of the Netherlands, the norm is to communicate in Dutch for internal purposes. This ADR will provide guidelines on which language to use for different types of communications.
"},{"location":"adrs/0014-written-language/#assumptions","title":"Assumptions","text":"There is no requirement to use Dutch as the primary language for all our activities while working for the Government of the Netherlands. More information can be found in the More Information section.
"},{"location":"adrs/0014-written-language/#decision","title":"Decision","text":"The following channels will utilize English:
The primary language for the following channels will be Dutch:
Dutch-only developers will have a harder time following along with the progression of our team on both the code on GitHub as our Project Management.
"},{"location":"adrs/0014-written-language/#consequences","title":"Consequences","text":"Although many attempts by previous cabinets, Dutch is not the official language in the Netherlands according to the Dutch constitution. See the following link.
According to the website of the Government of the Netherlands the Dutch language is the official recognized language. This means that in combination with the law Algemene wet bestuursrecht
on wetten.overheid.nl governing bodies and their employees need to communicate in Dutch unless stated differently elsewhere. It is stated here that communicating in another language than Dutch is permitted if the goal of communicating in another language than Dutch is sufficiently justified and if other parties are not effected disproportionately by the usage of another language.
Right now we have a few organizations (Logius, SSC-ICT, ODC-Noord, Tender process, and Digilab, etc...) offering IT infrastructure. This ADR will give an overview of what these different organizations are offering as well as make a decision for the AI Validation team on which infrastructure provider we will focus.
"},{"location":"adrs/0016-government-cloud-comparison/#descriptions-and-comparison","title":"Descriptions and comparison","text":"Please see the following picture for an overview of the providers in relation to what they can provide, currently we are heavily searching in the realm of unmanaged infrastructure, as we want this to manage ourselves.
"},{"location":"adrs/0016-government-cloud-comparison/#decision","title":"Decision","text":"For our infrastructure provider we decided to go with Digilab as the main source, as they can provide us with a Kubernetes namespace and are a reliable and convenient partner as we work closely with them.
"},{"location":"adrs/0016-government-cloud-comparison/#risks","title":"Risks","text":"Certain choices are made for us if we make use of the Kubernetes namespace of Digilab, for example that we need to make use of Flux for our CI/CD pipeline.
"},{"location":"adrs/0016-government-cloud-comparison/#extra-information","title":"Extra information","text":"Large Languages Models (LLMs) are becoming increasingly popular in assisting people in a variety of tasks. These tasks include, but are not limited to, information retrieval, assisting with coding and essay writing. In the context of the government, tasks can include for example supporting Freedom of Information Act (FOIA) requests and aiding in answering questions of citizens.
While the potential benefit of using LLMs is large, there are also significant risks. Basically an LLM is just a next token predictor, which bases its predictions on the user input (context) and on compressed information seen during training (LLM parameters); hence there is no guarantee on the quality and correctness of the output. Moreover, due to bias in the training data, LLMs can have bias in their output, despite best efforts to mitigate this. Additionally, we have human values that we expect LLMs to be aligned with. Certainly, within the context of a government, we should take utmost care not to discriminate. To assess the quality, correctness, bias and alignment with human values of an LLM one can perform benchmarks.
"},{"location":"projects/llm-benchmarks/#the-project","title":"The project","text":"The LLM Benchmarks project of the AI Validation Team aims to create a platform where LLMs can be measured across a wide range of benchmarks. We limit ourselves to LLMs and benchmarks that are related to the Dutch society. Both LLMs and the benchmarks can be configured by users of the platform. Users can run these benchmarks on LLMs on our platform. The intended goal of this project is to give government organizations, citizens and companies insight in the various LLMs and their quality, correctness, bias and alignment with human values. The project also encompasses a dashboard with uploaded LLMs and their performance on uploaded benchmarks. With this platform we aim to enhance public trust in the usage of LLMs and expose potential bias that exists within LLMs.
"},{"location":"projects/tad/","title":"TAD","text":"TAD is the acronym for Transparency of Algorithmic Decision making. TAD has the goal to make algorithmic systems more transparent; it achieves this by generating standardized reports on the algorithmic system which encompasses both technical aspects in addition to descriptive information about the system and regulatory assessments. For both the system and the model the lifecycle is important and this needs to be taken into account. The definition for an algorithm is derived from the Algoritmeregister.
One of the goals of the TAD project is providing a standardized format of reporting on a algorithmic system by developing a Reporting Standard. This Reporting Standard consists out of a System Card which contains Model Cards and Assessment Cards.
The final result of the project is producing System, Model and Assessment Cards with both performance metrics and technical measurements on fairness and bias of the model, assessments on the system where the specific algorithm resides, and descriptive information about the system.
The requirements and instruments are dictated by the Algoritmekader.
"},{"location":"projects/tad/comparison/","title":"Comparison of Reporting Standards","text":"This document assesses standards that standardize the way algorithm assessments can be captured.
"},{"location":"projects/tad/comparison/#background","title":"Background","text":"There are many algorithm assessments (e.g. IAMA, HUIDERIA, etc.), technical tests on performance (e.g. Accuracy, TP, FP, F1, etc), fairness and bias of algorithms (e.g. SHAP) and reporting formats available. The goal is to have a way of standardizing the way these different assessments and tests can be captured.
"},{"location":"projects/tad/comparison/#available-standards","title":"Available standards","text":""},{"location":"projects/tad/comparison/#model-cards","title":"Model Cards","text":"The most interesting existing capturing methods seem to be all based on Model Cards for Model Reporting, which are:
\"Short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information\", proposed by Google. Note that \"The proposed set of sections\" in the Model Cards paper \"are intended to provide relevant details to consider, but are not intended to be complete or exhaustive, and may be tailored depending on the model, context, and stakeholders.\"
Many companies implement their own version of Model Cards, for example Meta System Cards and the tools mentioned in the next section.
"},{"location":"projects/tad/comparison/#automatic-model-card-generation","title":"Automatic model card generation","text":"There exist tools to (semi)-automatically generate models cards:
A landscape analysis of ML documentation tools has been performed by Hugging Face and provides a good overview of the current landscape.
Another interesting standard is the Algorithmic Transparency Recording Standard of the United Kingdom Government, which can be found here.
"},{"location":"projects/tad/comparison/#proposal","title":"Proposal","text":"We need a standard that captures algorithmic assessments and technical tests on model and datasets. The idea of model cards can serve as a guiding theoretical principle on how to implement such a standard. More specifically, we can draw inspiration from the existing model card schema's and implementations of VerifyML and Hugging Face. We note the following:
Hence, in any case, we need to extend one of these standards. We propose to:
In modern software development practices, the use of Architecture Decision Records (ADRs) has become increasingly common. ADRs are documents that capture important architectural decisions made during the development process. These decisions play a crucial role in guiding the development team and ensuring consistency and coherence in the architecture of the software system.
"},{"location":"projects/tad/adrs/0001-adrs/#assumptions","title":"Assumptions","text":"We will utilize ADRs in this project repository and communicate architectural decisions effectively. Furthermore, we will publish these ADRs publicly to promote transparency and facilitate collaboration.
"},{"location":"projects/tad/adrs/0001-adrs/#template","title":"Template","text":"Use the template below to add an ADR:
# TAD-XXXX Title\n\n## Context\n\nWhat is the issue that we're seeing that is motivating this decision or change?\n\n## Assumptions\n\nAnything that could cause problems if untrue now or later. (optional)\n\n## Decision\n\nWhat is the change that we're proposing and/or doing?\n\n## Risks\n\nAnything that could cause malfunction, delay, or other negative impacts. (optional)\n\n## Consequences\n\nWhat becomes easier or more difficult to do because of this change?\n\n## More Information\n\nProvide additional evidence/confidence for the decision outcome.\nLinks to other decisions and resources may appear here as well. (optional)\n
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/","title":"TAD-0002 TAD Reporting Standard","text":""},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#context","title":"Context","text":"The TAD Reporting Standard proposes a standardized way of capturing information of ML-models and systems.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#assumptions","title":"Assumptions","text":"There is no existing standard of capturing all relevant information on ML-models that also includes fairness and bias tests and regulatory assessments.
A widely used implementation for Model Cards for Model Reporting is given by the Hugging Face Model Card metadata specification, which in turn is based on Papers with Code Model Index. This implementation does not capture sufficient details about metrics and does not include measurements from technical tests on bias and fairness or regulatory assessments.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#decision","title":"Decision","text":"We decided to implement a custom reporting standard. Our reporting standard can be split up into three elements.
We were heavily inspired by the Hugging Face Model Card metadata specification, which we essentially extended to allow for:
The extension is not strict, meaning that the TAD Reporting Standard is not a valid Hugging Face metadata specification. The reason is that some fields in the Hugging Face standard are too tightly intertwined with the Hugging Face ecosystem, and it would not be logical for us to couple our implementation that tightly to Hugging Face.
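A hedged illustration of the kind of extension meant here: a Hugging Face style metric entry next to the richer entry the TAD standard needs, carrying subgroup detail for fairness measurements; the keys are illustrative assumptions, not the published schema.
# Illustrative only: a Hugging Face style metric entry versus an extended\n# entry with the subgroup detail needed for fairness reporting.\nhf_style_metric = {'type': 'accuracy', 'value': 0.92}\n\ntad_style_metric = {\n    'type': 'false_positive_rate',\n    'value': 0.05,\n    # The subgroup this measurement applies to (hypothetical key).\n    'labels': [{'name': 'gender', 'value': 'female'}],\n}\n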
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#risks","title":"Risks","text":"The TAD Reporting Standard is not fully backwards compatible with the Hugging Face Model Card metadata specification. If in the future the Hugging Face Model Card metadata specification becomes a standard, we might need to revise the TAD standard.
"},{"location":"projects/tad/adrs/0002-tad-reporting-standard/#consequences","title":"Consequences","text":"The TAD Reporting Standard allows us to capture relevant information on model performance, bias and fairness and regulatory assessments in a standardized way.
"},{"location":"projects/tad/adrs/0003-tad-tool/","title":"TAD-0003 Tool for Transparency of Algorithmic Decision making","text":""},{"location":"projects/tad/adrs/0003-tad-tool/#context","title":"Context","text":"We are considering tooling for organizations to get more grip on their algorithms. Tooling for, for instance bias and fairness tests, and assessments (like IAMA).
Transparency, we think, can be fostered by sharing reports from such a tool in a standardized way.
There are several existing open source tools which we have assessed. Some support only assessments, others already combine more features and can generate a report. There is however no tool that supports all the requirements we have.
These are the main requirements for our tool:
We will build our own solution. Where possible, this solution should re-use components of other related open-source projects.
"},{"location":"projects/tad/adrs/0003-tad-tool/#risks","title":"Risks","text":"We can develop a solution that is tailored to the needs of our stakeholders.
"},{"location":"projects/tad/adrs/0004-software-stack/","title":"TAD-0004 Software Stack for TAD","text":""},{"location":"projects/tad/adrs/0004-software-stack/#context","title":"Context","text":"For building our own TAD solution, we need to choose a software stack. During our earlier POCs and market research, we gathered insight and information on technologies to use and which not to use.
During further discussions and brainstorming sessions, a software stack was chosen that best accommodates our needs.
While more fine grained requirements are listed elsewhere, some key requirements are:
We stick to widely adopted, well-supported programming languages. As most AI-related tooling is written in Python, this language is the logical choice for our development as well.
Currently we do not see the need for a separate web GUI framework; we prefer to bundle backend and frontend in one solution.
As part of a Dutch government organization, we need to adhere to all Dutch laws and standards, like:
We will support the latest three minor versions of Python 3 as our programming language and use Poetry for dependency management.
"},{"location":"projects/tad/adrs/0004-software-stack/#backend","title":"Backend","text":"The Python backend will use the following key dependencies:
We will use server-side rendering of HTML, based on HTMX. For styling and components we will use the NL Design System.
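A minimal sketch of what server-side rendering for HTMX can look like; the FastAPI-style backend, the endpoint and the markup here are assumptions purely for illustration, not our chosen dependency set.
# Hypothetical sketch: an HTMX-friendly endpoint that returns an HTML\n# fragment, assuming a FastAPI-style backend purely for illustration.\nfrom fastapi import FastAPI\nfrom fastapi.responses import HTMLResponse\n\napp = FastAPI()\n\n@app.get('/fragments/status', response_class=HTMLResponse)\ndef status_fragment() -> str:\n    # HTMX swaps this fragment into the page; no separate JS frontend needed.\n    return '<span id=\"status\">All checks passed</span>'\n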
"},{"location":"projects/tad/adrs/0004-software-stack/#testing","title":"Testing","text":"We will use pytest for unit-testing and VCRPY and Playwright for module and integration tests.
"},{"location":"projects/tad/adrs/0004-software-stack/#database","title":"Database","text":"We will use SQLModel or SQL Alchemy with SQLite for development and postgreSQL for production.
"},{"location":"projects/tad/adrs/0004-software-stack/#risks","title":"Risks","text":"As HTMX is relatively more limited than other UI frameworks, it may lack features we require but did not anticipate.
"},{"location":"projects/tad/adrs/0004-software-stack/#consequences","title":"Consequences","text":"We have clarity about the tools to use and develop our TAD tool.
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/","title":"TAD-0005 Add support to run technical tests via AI Verify","text":""},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#context","title":"Context","text":"The AI Verify project is set up in a modular way, and the technical tests is one of the modules. The AI Verify team is developing a feature which makes it possible to run the technical tests using an API: a Python library with a method to run a test and providing the required configuration; for example, which model and dataset to use and some test specific configuration.
The results of a test are returned in JSON format, which can be processed in any way we please, such as writing them to a file or System Card, or storing them in a database.
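Sketched in code, that flow could look like this; run_test and its parameters are hypothetical stand-ins for the AI Verify API described above, which was still under development at the time of writing.
import json\n\n# 'run_test' is a hypothetical stand-in for the AI Verify Python API.\ndef store_test_results(run_test):\n    results = run_test(\n        test='fairness_metrics',                 # which technical test to run\n        model_path='model.pkl',                  # the model under test\n        data_path='dataset.csv',                 # the dataset to evaluate on\n        config={'sensitive_feature': 'gender'},  # test-specific configuration\n    )\n    # Results come back as JSON-serializable data; persist them, for\n    # example into a file that feeds the System Card.\n    with open('measurements.json', 'w') as f:\n        json.dump(results, f, indent=2)\n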
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#pros","title":"Pros","text":"Our technical tests will include, but may extend beyond, those offered by AI Verify.
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#risks","title":"Risks","text":"The tests we use from AI Verify are tied to the AI Verify ecosystem. So it uses their (core) modules to load models and datasets. Adding support for other models or data formats, like models written in R, has to be done in the AI Verify core.
"},{"location":"projects/tad/adrs/0005-ai-verify-technical-tests/#consequences","title":"Consequences","text":"We have a set of technical tests we can integrate in the TAD tool.
"},{"location":"projects/tad/adrs/0006-extend-system-card-EU-AI-Act/","title":"TAD-0006 Include EU AI Act into System Card","text":""},{"location":"projects/tad/adrs/0006-extend-system-card-EU-AI-Act/#context","title":"Context","text":"The European Union AI Act represents a landmark regulatory framework aimed at ensuring the safe and ethical development and deployment of artificial intelligence technologies within the EU. It defines different policies and requirements for AI systems based on their risk levels, from minimal to unacceptable, to mitigate potential harms. Only for high-risk AI systems, an extended form of documentation is required, including technical documentation. This technical documentation consists of a general description of the AI system and a more detailed, in-depth description (including risk-management, monitoring, etc.).
To ensure that AI systems can be effectively audited, we aim to create a separate instrument called 'technical documentation for high-risk AI systems'. This will allow developers to easily extract and auditors to readily assess all necessary information for the technical documentation.
The RegCheck AI tool, published by Hugging Face, checks model cards for compliance with the EU AI Act. However, this prototype tool is research work, not a commercial or legal product. Furthermore, because we use a modified model card setup, its performance may be less reliable.
"},{"location":"projects/tad/adrs/0006-extend-system-card-EU-AI-Act/#assumptions","title":"Assumptions","text":"The extended system card and proposed instrument will facilitate the documentation of information in accordance with the EU AI Act using the TAD tool.
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/","title":"ALTAI","text":"See the introduction. It is a discussion tool about AI Systems.
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 0 The tool only allows for discussions not technical tests The tool allows users to choose which tests to perform. M 0 See above The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 1 This is very well supported by the tool The tool can generate a human readable report. M 0.9 There is an export functionality for the outcomes of the assessment, it offers a print dialog The tools works with a standardized report format, that it can read, write, and update. M 0 This report cannot be re-imported in a different tool as it only exports to pdf The tool supports plugin functionality so additional tests can be added easily. S 0 Not applicable The tool allows to create custom reports based on components. S 0 The report cannot be customized by the user It is possible to add custom components for reports. S 0 See above The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0.75 There is even for the users an extensive audit trail what happened to assessment, not different model versions The tool supports saving progress. S 1 Yes this is supported The tool can be used on an isolated system without an internet connection. S 1 Yes it can be ran locally or in a docker container without internet The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 1 This is the main feature of the tool The tool operates with complete data privacy; it does not share any data or logging information. C 1 Stored locally in a mongoDB The tool allows extension of report formats functionality. C 0.5 It could be developed that we export to markdown instead of pdf, but right now it just prints the window as pdf The tool can be integrated in a CI/CD flow. C 0 It is an UI tool, so doesn't make sense in a CI/CD pipeline The tool can be offered as a (cloud) service where no local installation is required. C 1 We could host this tool for other parties to use It is possible to define and automate workflows for repetitive tasks. C 0 It is an UI tool The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 Nototal_score = 22.85
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 Yes The tool recovers automatically from common failures. S 1 The tool seems too do this The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 1 The data is stored in mongoDB, so no data is lost The tool handles errors gracefully and informs users of any issues. S 1 If the email server is down the tool still operates The tool provides clear error messages and instructions for troubleshooting. S 0.8 Some errors are not very informative when you get them, but mostly email related aretotal_score = 15.4
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 1 Very clean UI The tool provides clear and consistent navigation, making it easy for users to find what they need. S 1 Compared to AIVerify the navigation is very intuitive (but it also has less features) The tool is responsive and provides instant feedback. S 1 Yes The user interface is multilingual and supports at least English. S 0.8 There is support for multilingual, but the assessments are not translated and needs to be translated by hand The tool offers keyboard shortcuts for efficient interaction. C 0 No The user interface can easily be translated into other languages. C 0.8 The buttons are automatically translated but not the assessment itselftotal_score = 13
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.1 There is little documentation, only the website and the github readme The tool offers context-sensitive help within the application. C 0 The icons are just very clear, would be nice to have a question mark at some places The online documentation includes video tutorials and training materials for ease of learning. C 0 There is no such documentation The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0.25 You can issue tickets on Github, no other way supported waytotal_score = 0.55
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 The docker container is not so very big, also doesn't use much resources The tool responds to user actions instantly. M 1 There is instant feedback in the UI The tool is scalable to accommodate increased user base and data volume. S 1 As it runs on Docker, you can scale this on Kubernetes for multiple userstotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 0.8 You need to be a bit aware of NextJS, then it is easy to maintain as it is not such a large tool The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 0.8 The code looks well structured, they have deployments on github but I don't see any CI or pre-commit hooks The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 NextJS is very common for frontend tools The project provides version control for code changes and rollback capabilities. M 1 The code is hosted on Github so yes The project is open source. M 1 see above It is possible to contribute to the source. S 1 It is possible, not many people have done this yet The system is modular, allowing for easy modification of individual components. S 0.6 Extra assessments can be appended to the system, but not in such a way that it supports multiple (different) assessments, but roles can be changed very easily Diagnostic tools are available to identify and troubleshoot issues. S 0.8 The standard NextJS tools to troubleshoot, but not many teststotal_score = 25.6
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 1 The data is stored in MongoDB Regular security audits and penetration testing are conducted. S 0 When running docker compose up, the docker client will tell there are quite some CVE vulnerabilities in there, an upgrade of the Node version would help much here The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0.5 The tool has support for multiple users and roles (but we couldn't find a user management system) Data encryption is used for sensitive information at rest and in transit. C 1 When data is transferred to mongoDB, a secure connection is set-up and also in the DB it is encrypted by MongoDB, also you have an SSL connection with the tool The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 The tool does allow this, as it is open-source The tool implements backup functionality to ensure data availability in case of incidents. C 1 The data is store in a volume next to the main container of thetotal_score = 7.5
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 As it is a container it can run on Kubernetes and therefore at Digilab The tool supports industry-standard data formats and protocols. M 1 Assessment and other config are stored in JSON The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As it runs in a container it is able to run on all the major OSes if you have Docker Desktop or use a cloud version managed by yourself The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 0 The tool currently only exports a pdf which is not an exchangeable format The tool integrates with existing security solutions. C 0 Not applicable as it is an UItotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0.1 The color scheme is pretty good viewable, but for the rest there are not accessibility featurestotal_score = 0.3
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 1 It is in docker so can run everywhere The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 This is all containerized The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 1 As it is containerized we could host this ourselves in a cloud environment, the Belgium government does not offer a hosted version for you The tool adheres to relevant cloud security standards and best practices. S 0.8 The docker container does contain some outdated versions of for example Node.total_score = 11.4
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 1 It was very easy to install out-of-the-box The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 The tool does not promise on-prem or cloud-based managed deploymentstotal_score = 3
"},{"location":"projects/tad/existing-tools/checklists/ai_assesment_tool_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 It is funded by the Belgian Government The tool is compliant with relevant legal and regulatory requirements. S 1 Yes EU license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data is stored locally The tool implements appropriate security measures to comply with industry regulations and standards. S 1 EUPL 1.2 license (although they say they have MIT license) The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Yes, see above The tool respects intellectual property rights and avoid copyright infringement issues. S 1 Yes, see abovetotal_score = 19
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/","title":"AI Verify","text":"See the introduction
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 This is core functionality of AIVerify The tool allows users to choose which tests to perform. M 1 This is core functionality of AIVerify The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 1 This is core functionality of AIVerify, however work is needed to add extra impact assessments The tool can generate a human readable report. M 1 This is core functionality of AIVerify The tools works with a standardized report format, that it can read, write, and update. M 0 The outputted format is a PDF format, so this cannot be updated, or easily read by a machine. The tool supports plugin functionality so additional tests can be added easily. S 0.5 One can add a test as a plugin, it can however be a bit too technical still for many people. The tool allows to create custom reports based on components. S 1 One can slide the technical tests results and the assessment test results into a report which will be placed into a PDF It is possible to add custom components for reports. S 1 It is possible, but just like with tests can be hard for non-technical people The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0.5 There are versions of models when uploaded, and the report itself is the technical test result of a run. Changes to impact assessments are not logged (only when a report is generated) The tool supports saving progress. S 1 Reports can be saved, while it is being constructed The tool can be used on an isolated system without an internet connection. S 1 Locally the docker container can be build and ran The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 Only the end-result will be logged into the report The tool operates with complete data privacy; it does not share any data or logging information. C 1 The application is a docker application and does not do this The tool allows extension of report formats functionality. C 1 We could program this functionality in the tool and submit a PR The tool can be integrated in a CI/CD flow. C 0.5 It is possible, but would be very heavy to do so. The build time is quite large, and only the technical tests could be ran in an automated fashion The tool can be offered as a (cloud) service where no local installation is required. C 0 AIVerify is currently not doing this, we could however offer it as a cloud service It is possible to define and automate workflows for repetitive tasks. C 0 As this tool is focused on UI, this is not possible The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 This is not includedtotal_score = 36
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 The tool did not break down a single time while we were coding a plugin (only threw errors) The tool recovers automatically from common failures. S 1 Common failures like missing datasets or models are not breaking The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 0.5 The assessments you need to manually save otherwise it will be lost, but over different sessions the data will be stored persistent even if the containers go down. Test results are only stored in the generated report The tool handles errors gracefully and informs users of any issues. S 1 When failed to generate a report the tool will log the error messages, otherwise when loading in data that is non existing the application (while not being very clear in error message) just continues with an error The tool provides clear error messages and instructions for troubleshooting. S 0.5 The test-engine-core is a dependency that is installed as a package, and therefore the error message will not contain error in that packagetotal_score = 13
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 1 The tool does follow the material design principles for example when you hover over items they will respond to user input The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0.5 It is not completely clear where in the tool you are when interacting with it and sometimes you could go back to home but not always The tool is responsive and provides instant feedback. S 1 Even for jobs like generating tests and the report, it scheduled jobs and will notify you when it is done The user interface is multilingual and supports at least English. S 0.5 Currently it only supports english The tool offers keyboard shortcuts for efficient interaction. C 0 It is mainly UI and therefore no keyboard shortcuts The user interface can easily be translated into other languages. C 0.2 It would need quite some refactoring when adding support for the Dutch Language (especially the more technical words like Warning or the metadata on all the pluginstotal_score = 9.4
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.8 From the end-user perspective yes, from the development perspective no (for example that you need to rebuild packages like the test-engine-core The tool offers context-sensitive help within the application. C 0 Not included in the tool The online documentation includes video tutorials and training materials for ease of learning. C 0 Although it contains many images The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0.2 Just email, which they do not respond to very quicklytotal_score = 2.8
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 0.5 The tool is efficient, minimal waiting and no lag although it uses up quite some resources which could be optimized The tool responds to user actions instantly. M 1 Instantaneous response time The tool is scalable to accommodate increased user base and data volume. S 0.5 As it is built into a container it can be made scalable with Kubernetes, but the the tool itself can become very slow when generating results for a large dataset and model (because of the extra overhead)total_score = 7.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 0.2 Adding a new plugin for a model type was quite hard, other plugins however are more easier The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 0.2 The docker side of the project could have a big improvement The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 Backend in Python, Frontend in NextJs The project provides version control for code changes and rollback capabilities. M 0.8 The code is stored on Github, but the container itself not and also the packages which the tools depend on not The project is open source. M 1 Github link It is possible to contribute to the source. S 0.5 It is possible, although with our three features it takes a while for them to dedicated time for integration The system is modular, allowing for easy modification of individual components. S 0.5 The technical tests and assessments are easy to adjust, other core features not Diagnostic tools are available to identify and troubleshoot issues. S 0 Diagnosing some parts of the system took us quite some time as we couldn't properly debug in the containerstotal_score = 15.8
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0.5 This managed by that the data is stored in MongoDB however, it currently only has 1 user support Regular security audits and penetration testing are conducted. S 0.1 We are unaware of the security audits but they do have a security policy here The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 Currently only 1 user can use the system and see all the data Data encryption is used for sensitive information at rest and in transit. C 1 When data is transferred to mongoDB, a secure connection is set-up and also in the DB it is encrypted by MongoDB, also you have an SSL connection with the tool The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 As you can install it locally, this is possible The tool implements backup functionality to ensure data availability in case of incidents. C 1 Data is stored persistent, so even if the tool breaks the data will be in volumestotal_score = 8.3
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 As it is a container it can run on Kubernetes and therefore at Digilab The tool supports industry-standard data formats and protocols. M 1 Most Datasets and Models are supported by the tool The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As it runs in a container it is able to run on all the major OS'es if you have Docker Desktop or use a cloud version managed by yourself The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 0.5 As input many types are accepted, but only as export there is a PDF report The tool integrates with existing security solutions. C 0 It does not integrate with security solutionstotal_score = 12.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 It is not clear what the tool actually does with one look, also the color change when hovering over elements is not a large difference compared to the original color (the purple and pink)total_score = 0
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 1 It is containerized The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 This is all containerized The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 1 As it is containerized we could host this ourselves in a cloud environment The tool adheres to relevant cloud security standards and best practices. S 0.5 The making of the container it self is lacking some best practices, otherwise the cloud security standards are not applicable as it is a self-hosted tooltotal_score = 10.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.5 You need to be technical to be able to install and deploy, but then it is relatively easy The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 The tool does not promise on-prem or cloud-based managed deploymentstotal_score = 1.5
"},{"location":"projects/tad/existing-tools/checklists/aiverify_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 On the website it is stated, that many commercial partners fund this project The tool is compliant with relevant legal and regulatory requirements. S 1 The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 The tool implements appropriate security measures to comply with industry regulations and standards. S 1 The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1total_score = 19
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/","title":"Holistic AI","text":"See the introduction. It is a toolkit just like IBM-360-Toolkit for a data scientist to research bias and also to mitigate it immediately.
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 The tests which can be executed are written here The tool allows users to choose which tests to perform. M 1 In code the user is free to choose any test The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 0 The tool only does technical tests The tool can generate a human readable report. M 0 The toolkit itself cannot make a human readable report, it only generates results which then needs to be interpreted The tools works with a standardized report format, that it can read, write, and update. M 0 The only format it outputs are specific numbers, so no standardized format or even een report format The tool supports plugin functionality so additional tests can be added easily. S 0 All the bias tests are put in a single script which making additional tests a bit cumbersome and leas developer-friendly The tool allows to create custom reports based on components. S 0 Does not allow reports export It is possible to add custom components for reports. S 0 Does not allow reports export The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool supports saving progress. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool can be used on an isolated system without an internet connection. S 1 As a python tool this is possible The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 This is not supported The tool operates with complete data privacy; it does not share any data or logging information. C 1 The local tool does not share anything to the outside world The tool allows extension of report formats functionality. C 0 This is not what the tool is built for The tool can be integrated in a CI/CD flow. C 1 As it is a python package it can be included in a CI pipeline The tool can be offered as a (cloud) service where no local installation is required. C 0 Not immediately, an UI needs to be build around it It is possible to define and automate workflows for repetitive tasks. C 1 Automated tests could be programmed specifically from this tool The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 Not supported by the tooltotal_score = 17
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 The tool recovers automatically from common failures. S 1 The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 1 The tool handles errors gracefully and informs users of any issues. S 1 The tool provides clear error messages and instructions for troubleshooting. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 0 There is no user-interface The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0 There is no user-interface The tool is responsive and provides instant feedback. S 0 There is no user-interface The user interface is multilingual and supports at least English. S 0 There is no user-interface The tool offers keyboard shortcuts for efficient interaction. C 0 There is no user-interface The user interface can easily be translated into other languages. C 0 There is no user-interfacetotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.2 There is some documentation but it is not very helpful The tool offers context-sensitive help within the application. C 0 As a Python tool, no The online documentation includes video tutorials and training materials for ease of learning. C 0 Ths is not there The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0.5 You can contact sales through their website and respond on Github, Github seems to be an okay response time (but not a large community)total_score = 1.6
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 very lightweight as a python package The tool responds to user actions instantly. M 1 It will return output instantly The tool is scalable to accommodate increased user base and data volume. S 1 This would be installed distributed and therefore would be scalable, with large datasets it is still very quicktotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 0.5 It is less modular because most of the tests are written in a single script The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 0.5 They use pre-commit hooks, but the codebase seems to be a bit weirdly structured The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 It is written in Python The project provides version control for code changes and rollback capabilities. M 1 It is hosted on Github The project is open source. M 1 Hosted here It is possible to contribute to the source. S 1 It is possible and they respond to contributions The system is modular, allowing for easy modification of individual components. S 0.5 See the first point Diagnostic tools are available to identify and troubleshoot issues. S 1 Just standard python troubleshooting toolstotal_score = 23.5
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0 Not applicable Regular security audits and penetration testing are conducted. S 0 It is not stated on the repository that they do something with security The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 The tool does not have Users or Access control Data encryption is used for sensitive information at rest and in transit. C 0 Transitionary data is not stored The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 This is not blocked by the tool The tool implements backup functionality to ensure data availability in case of incidents. C 0 Not supportedtotal_score = 2
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 It can be imported in Python The tool supports industry-standard data formats and protocols. M 0 it does not standardize at all in the output of the tests The tool operates seamlessly on supported operating systems and hardware platforms. S 1 Python can be ran on any system The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 1 If it can be imported in Python/R it is supported The tool integrates with existing security solutions. C 0 Not applicabletotal_score = 10
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 You need to be a programmer to use it, and that is not your typical user with disabilitiestotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 0.5 As it is a python tool it is supported anywhere python runs The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 It is a python tool The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 1 The company behind Holistic AI offers a whole range of services included an UI which uses this open-source toolkit The tool adheres to relevant cloud security standards and best practices. S 0 On their website they do not speak about where the data of their solution will go, this is not very transparenttotal_score = 7.5
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.2 You need to have some developer knowledge and also knowledge about the technical tests to use The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 1 Yes the tool can be used as a cloud-based deployment but then with a whole UI around ittotal_score = 3.6
"},{"location":"projects/tad/existing-tools/checklists/holisticai_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 The tool is owned by a private company but has been made open source to the public The tool is compliant with relevant legal and regulatory requirements. S 1 Under the apache 2.0 license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data stays locally The tool implements appropriate security measures to comply with industry regulations and standards. S 0 The repo does not speak about security at all The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Under the apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/","title":"IBM Research 360 Toolkit","text":"See the introduction, same thing as verifyML this has no frontend baked in, but has some nice integrations with MLops tooling like Kubeflow Pipelines. The IBM Research 360 toolkit is actually a collection of three open-source toolkits as stated by their Github repo; AI Fairness 360, AI Explainability 360, Adversarial Robustness 360. The strong suite of this toolkit that it considers bias in the whole lifecycle of the model; (dataset, training, output).
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 Fairness, Explainability and security can be tested with the suite of tools The tool allows users to choose which tests to perform. M 1 The websites of contain a whole explanation of which tests to perform AIF Website, AIX website, ART website The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 0 The tool only does technical tests The tool can generate a human readable report. M 0 The toolkit itself cannot make a human readable report, it only generates results which then needs to be interpreted The tools works with a standardized report format, that it can read, write, and update. M 0 The only format it outputs are specific numbers, so no standardized format or even een report format The tool supports plugin functionality so additional tests can be added easily. S 1 Only the repository new tests could be added quite easily if you understand Python The tool allows to create custom reports based on components. S 0 The tool does not generate reports It is possible to add custom components for reports. S 0 The tool does not generate reports The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool supports saving progress. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool can be used on an isolated system without an internet connection. S 1 As it can be imported as a python or r library The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 This is not supported, there is no UI The tool operates with complete data privacy; it does not share any data or logging information. C 1 The tool does not share data The tool allows extension of report formats functionality. C 0 The tool does not generate reports The tool can be integrated in a CI/CD flow. C 1 As it is a programming toolkit it can be used in a CI/CD pipeline The tool can be offered as a (cloud) service where no local installation is required. C 0 not immediately, then an UI needs to be made It is possible to define and automate workflows for repetitive tasks. C 1 We could automate specific tests which we deem necessary or standard The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 Purely written in Pythontotal_score = 20
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 The tool recovers automatically from common failures. S 1 The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 1 The tool handles errors gracefully and informs users of any issues. S 1 The tool provides clear error messages and instructions for troubleshooting. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 0 There is no user-interface The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0 There is no user-interface The tool is responsive and provides instant feedback. S 0 There is no user-interface The user interface is multilingual and supports at least English. S 0 There is no user-interface The tool offers keyboard shortcuts for efficient interaction. C 0 There is no user-interface The user interface can easily be translated into other languages. C 0 There is no user-interfacetotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.8 On the website of the specific toolkit you can find many docs but you cannot search The tool offers context-sensitive help within the application. C 0 Within the application (as it is not an UI, does not offer specific help) The online documentation includes video tutorials and training materials for ease of learning. C 1 The amount of tutorials is extensive even videos of its usage The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 1 You can ask questions at the repository, but also in slack and many people are using thistotal_score = 6.4
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 very lightweight as a python package The tool responds to user actions instantly. M 1 It will return output instantly The tool is scalable to accommodate increased user base and data volume. S 1 This would be installed distributed and therefore would be scalable, with large datasets it is still very quicktotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 1 The repositories are very well structured and therefore easy to adjust The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 1 Although it doesn't have pre-commit hooks it does have a CONTRIBUTING.rst where the rules of good practices are written down The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 It is written in Python The project provides version control for code changes and rollback capabilities. M 1 The code is hosted on Github The project is open source. M 1 At the beginning of this doc you can find the links to the repositories It is possible to contribute to the source. S 1 They have merged many outside requests, so this is fine The system is modular, allowing for easy modification of individual components. S 1 Tests can very easily be added if you understand Python Diagnostic tools are available to identify and troubleshoot issues. S 1 Just standard python troubleshooting toolstotal_score = 29
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0 not applicable Regular security audits and penetration testing are conducted. S 0 It is not stated on the repository that they do something with security The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 The tool does not have Users or Access control Data encryption is used for sensitive information at rest and in transit. C 0 Transitionary data is not stored The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 This is not blocked by the tool The tool implements backup functionality to ensure data availability in case of incidents. C 0 Not supportedtotal_score = 2
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 It can easily be imported in Python or R The tool supports industry-standard data formats and protocols. M 0.5 It does not standardize really on any output from the tests The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As a python and R tool it can be run on systems where these can be ran The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 1 These can be used if they are imported in python and R The tool integrates with existing security solutions. C 1 The Adversarial Robustness Toolbox can be used to test for the security of AI Systemstotal_score = 14
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 You need to be a programmer to use it, and that is not your typical user with disabilitiestotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 0.7 If you can run python, which is not always possible within the government for example, but R could be more easy to be run on places The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 Just a python tool, no UI which is fairly minimal The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 0 It is not offered as a cloud-based option The tool adheres to relevant cloud security standards and best practices. S 0 Not relevanttotal_score = 5.1
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.4 You need to have some developer knowledge and also knowledge about the technical tests to use. But then it is quite easy and works fairly quickly The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 Not applicabletotal_score = 1.2
"},{"location":"projects/tad/existing-tools/checklists/ibm_360_research_toolkit_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 The tool was from IBM, but slowly they are removing the IBM branding from this and the tool is now owned by the LF AI Foundation (where big companies are part of) The tool is compliant with relevant legal and regulatory requirements. S 1 All three tools have apache 2.0 license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data will stay local The tool implements appropriate security measures to comply with industry regulations and standards. S 0 Nothing is known about the security measures of the toolkits The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 All three tools have apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1 The specific tests are implementations of papers which are open for everyonetotal_score = 16
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/","title":"VerifyML","text":"See the introduction, the maker also suggests to use an front-end tool to collaboratively change the model card. Model Card Editor this is not open-source and also the developer suggests in this issue to not use this tool but to use tools like AIVerify. This checklist only looks at the verifyML python toolkit and not the web interface.
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#functionality","title":"Functionality","text":"Requirement Priority Fulfilled Comments The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M 1 The tool does allow a few standardized tests, specified here The tool allows users to choose which tests to perform. M 1 In code the user is free to choose any test The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M 0 The tool can generate a human readable report. M 1 The tool can visualize model cards that are generated by it The tools works with a standardized report format, that it can read, write, and update. M 1 It generates html which can be imported by a machine The tool supports plugin functionality so additional tests can be added easily. S 1 Any test can be ran by the user itself and the output imported in the model card generated by the tool The tool allows to create custom reports based on components. S 0 It doesn't offer any standardization in what to put in the report It is possible to add custom components for reports. S 1 Anything can be put in the model card, which makes it very flexible The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S 0 Not ouf of the box, but this could be written in code by the owner of the algorithm The tool supports saving progress. S 1 Once the modelcard is generated it could be loaded in again and be changed The tool can be used on an isolated system without an internet connection. S 1 Once the tool is imported in python it can be used without an internet connection The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C 0 Assessments are not supported The tool operates with complete data privacy; it does not share any data or logging information. C 1 It does not do this The tool allows extension of report formats functionality. C 1 As it exports html, it can also be transferred to json or markdown The tool can be integrated in a CI/CD flow. C 1 The automated tests could be ran in the CI/CD tool to generated a model card The tool can be offered as a (cloud) service where no local installation is required. C 0 The python tool itself not, but a frontend which needs to be developed yes It is possible to define and automate workflows for repetitive tasks. C 1 As it is written in python this can be automated easily The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C 0 The tool does this nottotal_score = 42
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#reliability","title":"Reliability","text":"Requirement Priority Fulfilled Comments The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M 1 Once you have located the right (older) libraries it runs pretty smoothly and reliably The tool recovers automatically from common failures. S 0 Library dependencies needs to be solved by yourself as this is not handled by the tool (especially graphs) The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S 0 It does not store any intermediary results The tool handles errors gracefully and informs users of any issues. S 0 It just breaks, you need to explicitly export the model card for it to saved The tool provides clear error messages and instructions for troubleshooting. S 0 The error messages are python error messages unrelated to the tooltotal_score = 4
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#usability","title":"Usability","text":"Requirement Priority Fulfilled Comments The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S 0 There is no user interface The tool provides clear and consistent navigation, making it easy for users to find what they need. S 0 There is no user interface The tool is responsive and provides instant feedback. S 0 There is no user interface The user interface is multilingual and supports at least English. S 0 There is no user interface The tool offers keyboard shortcuts for efficient interaction. C 0 There is no user interface The user interface can easily be translated into other languages. C 0 There is no user interfacetotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#help-documentation","title":"Help & Documentation","text":"Requirement Priority Fulfilled Comments The tool provides comprehensive online help documentation with searchable functionalities. S 0.5 The documentation is quite concise and helpful, but it is outdated The tool offers context-sensitive help within the application. C 0 No context info whatsoever The online documentation includes video tutorials and training materials for ease of learning. C 0 Just documentation The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. C 0 The people who worked on the tool are quick to respond to issues, but they don't support the tool anymoretotal_score = 1.5
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority Fulfilled Comments The tool operates efficiently and minimize resource utilization. M 1 Very lightweight tool, as it is a python package The tool responds to user actions instantly. M 1 When run, it returns instantly The tool is scalable to accommodate increased user base and data volume. S 1 This would be installed distributed and therefore would be scalable, with large datasets it is still very quicktotal_score = 11
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#maintainability","title":"Maintainability","text":"Requirement Priority Fulfilled Comments The tool is easy to modify and maintain. M 1 The tool itself it not so large and written with tools we are all quite aware of The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M 1 The repository has poetry, pre-commit hooks, has a CI, and looks well structured The code is written in a common, widely adopted and supported and actively used and maintained programming language. M 1 in Python and jupyter notebooks The project provides version control for code changes and rollback capabilities. M 1 It is hosted on Github The project is open source. M 1 Apache 2.0 license It is possible to contribute to the source. S 0 The project is not active supported anymore, so we would need to make a fork and make that the main source The system is modular, allowing for easy modification of individual components. S 0.5 The idea of a model card is pretty modular, and can be changed any way we like. Adding assessments in the tool would be quite the effort Diagnostic tools are available to identify and troubleshoot issues. S 1 Just standard python troubleshooting toolstotal_score = 24.5
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#security","title":"Security","text":"Requirement Priority Fulfilled Comments The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M 0 not applicable Regular security audits and penetration testing are conducted. S 0 As the tool is not actively maintained anymore The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C 0 As this is a local import only, this is managed by the developer Data encryption is used for sensitive information at rest and in transit. C 0 Intermediary data is not stored, and the end result is put in html with no encryption The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C 1 It does not block this for users to do this The tool implements backup functionality to ensure data availability in case of incidents. C 0 Not supportedtotal_score = 2
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#compatibility","title":"Compatibility","text":"Requirement Priority Fulfilled Comments The tool is compatible with existing systems and infrastructure. M 1 It can be easily imported and installed in python The tool supports industry-standard data formats and protocols. M 1 Standardized tests are used and the output format is html The tool operates seamlessly on supported operating systems and hardware platforms. S 1 As it is a python tool, anywhere where python can run this can also be run The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S 1 This can be imported The tool integrates with existing security solutions. C 0 It does not do such a thingtotal_score = 14
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#accessibility","title":"Accessibility","text":"Requirement Priority Fulfilled Comments The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S 0 You need to be a programmer to use it, and that is not your typical user with disabilitiestotal_score = 0
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#portability","title":"Portability","text":"Requirement Priority Fulfilled Comments The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S 0.5 If you can run python, which is not always possible within the government for example The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S 1 As it is a python tool The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S 0 It is not offered as a cloud-based option The tool adheres to relevant cloud security standards and best practices. S 0 On the github nothing is mentioned about security and for the cloud version it is not applicabletotal_score = 4.5
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#deployment","title":"Deployment","text":"Requirement Priority Fulfilled Comments The tool has an easy and user-friendly installation and configuration process. S 0.2 You need to have some developer knowledge and also knowledge about the technical tests to use The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. S 0 Not applicabletotal_score = 0.6
"},{"location":"projects/tad/existing-tools/checklists/verifyml_checklist/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority Fulfilled Comments It is clear how the tool is funded to avoid improper influence due to conflicts of interest M 1 It was developed during a competition and it does not receive funding anymore The tool is compliant with relevant legal and regulatory requirements. S 1 Under the apache 2.0 license The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S 1 Data will stay local The tool implements appropriate security measures to comply with industry regulations and standards. S 0 The repo does not speak about security at all The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S 1 Under the apache 2.0 license The tool respects intellectual property rights and avoid copyright infringement issues. S 1total_score = 16
"},{"location":"projects/tad/existing-tools/comparison/requirements/","title":"Requirements for tools for Transparency of Algorithmic Decision making","text":"This document contains a checklist with requirements for tools we could use to help with the transparency of algorithmic decision making.
The requirements are based on:
The requirements have been given a priority based on the MoSCoW scale to allow for tool comparison.
"},{"location":"projects/tad/existing-tools/comparison/requirements/#functionality","title":"Functionality","text":"Requirement Priority The tool allows users to conduct technical tests on algorithms or models, including assessments of performance, bias, and fairness. To facilitate these tests, users can input relevant datasets, M The tool allows users to choose which tests to perform. M The tool allows users to fill out questionnaires to conduct impact assessments for AI. For example IAMA or ALTAI. M The tool can generate a human readable report. M The tools works with a standardized report format, that it can read, write, and update. M The tool supports plugin functionality so additional tests can be added easily. S The tool allows to create custom reports based on components. S It is possible to add custom components for reports. S The tool provides detailed logging, including tracking of different model versions, changes in impact assessments, and technical test results for individual runs. S The tool supports saving progress. S The tool can be used on an isolated system without an internet connection. S The tool offers options to discuss and document conversations. For example, to converse about technical tests or to collaborate on impact assessments. C The tool operates with complete data privacy; it does not share any data or logging information. C The tool allows extension of report formats functionality. C The tool can be integrated in a CI/CD flow. C The tool can be offered as a (cloud) service where no local installation is required. C It is possible to define and automate workflows for repetitive tasks. C The tool offers pre-built connectors or low-code/no-code integration options to simplify the integration process. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#reliability","title":"Reliability","text":"Requirement Priority The tool operates consistently and reliably, meaning it delivers the same expected results every time you use it. M The tool recovers automatically from common failures. S The tool recovers from failures quickly, minimizing data loss, for example by automatically saving intermediate test progress results. S The tool handles errors gracefully and informs users of any issues. S The tool provides clear error messages and instructions for troubleshooting. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#usability","title":"Usability","text":"Requirement Priority The tool possess a clean, intuitive, and visually appealing UI that follows industry standards. S The tool provides clear and consistent navigation, making it easy for users to find what they need. S The tool is responsive and provides instant feedback. S The user interface is multilingual and supports at least English. S The tool offers keyboard shortcuts for efficient interaction. C The user interface can easily be translated into other languages. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#help-documentation","title":"Help & Documentation","text":"Requirement Priority The tool provides comprehensive online help documentation with searchable functionalities. S The tool offers context-sensitive help within the application. C The online documentation includes video tutorials and training materials for ease of learning. C The project provides readily available customer support through various channels (e.g., email, phone, online chat) to address user inquiries and troubleshoot issues. 
C"},{"location":"projects/tad/existing-tools/comparison/requirements/#performance-efficiency","title":"Performance Efficiency","text":"Requirement Priority The tool operates efficiently and minimize resource utilization. M The tool responds to user actions instantly. M The tool is scalable to accommodate increased user base and data volume. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#maintainability","title":"Maintainability","text":"Requirement Priority The tool is easy to modify and maintain. M The tool adheres to industry coding standards and best practices to ensure code quality and maintainability. M The code is written in a common, widely adopted and supported and actively used and maintained programming language. M The project provides version control for code changes and rollback capabilities. M The project is open source. M It is possible to contribute to the source. S The system is modular, allowing for easy modification of individual components. S Diagnostic tools are available to identify and troubleshoot issues. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#security","title":"Security","text":"Requirement Priority The tool must protect data and system from unauthorized access, use, disclosure, disruption, modification, or destruction. M Regular security audits and penetration testing are conducted. S The tool enforce authorization controls based on user roles and permissions, restricting access to sensitive data and functionalities. C Data encryption is used for sensitive information at rest and in transit. C The project allows for regular security audits and penetration testing to identify vulnerabilities and ensure system integrity. C The tool implements backup functionality to ensure data availability in case of incidents. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#compatibility","title":"Compatibility","text":"Requirement Priority The tool is compatible with existing systems and infrastructure. M The tool supports industry-standard data formats and protocols. M The tool operates seamlessly on supported operating systems and hardware platforms. S The tool supports commonly used data formats (e.g., CSV, Excel, JSON) for easy data exchange with other systems and tools. S The tool integrates with existing security solutions. C"},{"location":"projects/tad/existing-tools/comparison/requirements/#accessibility","title":"Accessibility","text":"Requirement Priority The tool is accessible to users with disabilities, following relevant accessibility standards (e.g., WCAG). S"},{"location":"projects/tad/existing-tools/comparison/requirements/#portability","title":"Portability","text":"Requirement Priority The tool support a range of operating systems (e.g., Windows, macOS, Linux) commonly used within an organization. S The tool minimizes dependencies on specific hardware or software configurations, promoting flexibility. S The tool offers a cloud-based deployment option or be compatible with cloud environments for scalability and accessibility. S The tool adheres to relevant cloud security standards and best practices. S"},{"location":"projects/tad/existing-tools/comparison/requirements/#deployment","title":"Deployment","text":"Requirement Priority The tool has an easy and user-friendly installation and configuration process. S The tool has on-premise or cloud-based deployment options to cater to different organizational needs and infrastructure. 
S"},{"location":"projects/tad/existing-tools/comparison/requirements/#legal-compliance","title":"Legal & Compliance","text":"Requirement Priority It is clear how the tool is funded to avoid improper influence due to conflicts of interest M The tool is compliant with relevant legal and regulatory requirements. S The tool adheres to (local) data privacy regulations like GDPR, ensuring the protection of user data. S The tool implements appropriate security measures to comply with industry regulations and standards. S The tool is licensed for use within the organization according to the terms and conditions of the license agreement. S The tool respects intellectual property rights and avoid copyright infringement issues. S"},{"location":"projects/tad/existing-tools/comparison/tools/","title":"Research of tools for Transparency of Algorithmic Decision making","text":"In our ongoing research on AI validation and transparency, we are seeking tools to support assessments. Ideal tools would combine various technical tests with checklists and questionnaires and have the ability to generate reports in both human-friendly and machine-exchangeable formats.
This document contains a list of tools we have found and may want to investigate further.
"},{"location":"projects/tad/existing-tools/comparison/tools/#ai-verify","title":"AI Verify","text":"AI Verify is an AI governance testing framework and software toolkit that validates the performance of AI systems against a set of internationally recognized principles through standardized tests, and is consistent with international AI governance frameworks such as those from European Union, OECD and Singapore.
Links: AI Verify Homepage, AI Verify documentation, AI Verify GitHub.
"},{"location":"projects/tad/existing-tools/comparison/tools/#to-investigate-further","title":"To investigate further","text":""},{"location":"projects/tad/existing-tools/comparison/tools/#verifyml","title":"VerifyML","text":"What is it? VerifyML is an opinionated, open-source toolkit and workflow to help companies implement human-centric AI practices. It seems pretty much equivalent to AI Verify.
Why interesting? The functionality of this toolkit seems to match closely with those of AI Verify. It has a \"git and code first approach\" and has automatic generation of model cards.
Remarks The code seems to have been last updated two years ago.
Links: VerifyML, VerifyML GitHub
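To give an impression of the workflow, here is a rough sketch of generating a model card with the toolkit. This is an assumption-laden illustration: it presumes the API mirrors Google's model-card-toolkit, from which VerifyML is derived, so module and method names may differ per version.

# Hypothetical sketch of generating a model card with VerifyML.
# Assumes the API mirrors Google's model-card-toolkit, from which
# VerifyML is derived; method names may differ per version.
import verifyml.model_card_toolkit as mctlib

# Scaffold a new model card in a local output directory
mct = mctlib.ModelCardToolkit(output_dir='model_card_output')
model_card = mct.scaffold_assets()

# Fill in descriptive fields; test output can be added in the same way
model_card.model_details.name = 'loan-approval-classifier'
model_card.model_details.overview = 'Toy model used to illustrate the workflow.'

# Persist the card and render it as HTML for human consumption
mct.update_model_card(model_card)
html = mct.export_format(model_card=model_card, output_file='model_card.html')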
"},{"location":"projects/tad/existing-tools/comparison/tools/#ibm-research-360-toolkit","title":"IBM Research 360 Toolkit","text":"What is it? Open source Python libraries that supports interpretability and explainability of datasets and machine learning models. Most relevant toolkits are the AI Fairness 360 and AI Explainability 360.
Why interesting? Seems to encompass extensive fairness and explainability tests. Codebase seems to be active.
Remarks It comes as Python and R libraries.
Links: AI Fairness 360 GitHub, AI Explainability 360 GitHub.
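To give a feel for the developer experience, the sketch below computes two group fairness metrics with AI Fairness 360. The toy dataframe and column names are invented for illustration; the aif360 classes shown are the library's documented entry points.

# Minimal sketch: group fairness metrics with AI Fairness 360.
# The toy dataframe and its column names are invented for illustration.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

df = pd.DataFrame({
    'income': [30, 45, 28, 60, 52, 33],
    'sex': [0, 1, 0, 1, 1, 0],    # protected attribute (0 = unprivileged)
    'label': [0, 1, 1, 1, 0, 1],  # favorable outcome = 1
})

dataset = BinaryLabelDataset(
    df=df,
    label_names=['label'],
    protected_attribute_names=['sex'],
)

metric = BinaryLabelDatasetMetric(
    dataset,
    unprivileged_groups=[{'sex': 0}],
    privileged_groups=[{'sex': 1}],
)

# Values near 1.0 (disparate impact) and 0.0 (statistical parity
# difference) indicate the dataset treats both groups similarly.
print(metric.disparate_impact())
print(metric.statistical_parity_difference())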
"},{"location":"projects/tad/existing-tools/comparison/tools/#holistic-ai","title":"Holistic AI","text":"What is it? Open source tool to assess and improve the trustworthiness of AI systems. Offers tools to measure and mitigate bias across numerous tasks. Will be extended to include tools for efficacy, robustness, privacy and explainability.
Why interesting? Although it is not entirely clear what exactly this tool does (see Remarks), it does seem (according to their website) to provide reports on bias and fairness. The GitHub repository does not seem to include any report-generating code, but mainly technical tests. Here is an example in which bias is measured in a classification model.
Remarks Website seems to suggest the possibility to generate reports, but this is not directly reflected in the codebase. Possibly reports are only available with some sort of licensed product?
Links: Holistic AI Homepage, Holistic AI GitHub.
"},{"location":"projects/tad/existing-tools/comparison/tools/#ai-assessment-tool","title":"AI Assessment Tool","text":"What is it? The tool is based on the ALTAI published by the European Commission. It is more of a discussion tool about AI Systems.
Why interesting? Although it only includes questionnaires, it does offer an interesting way of reporting the end results. Discussions on, for example, the IAMA can also be documented within the tool.
Remarks The tool of the EU itself is not open source, but the tool from Belgium is. Does not include any technical tests at this point.
Links: AI Assessment Tool Belgium homepage, AI Assessment Tool Belgium GitHub
"},{"location":"projects/tad/existing-tools/comparison/tools/#interesting-to-mention","title":"Interesting to mention","text":"What-if. Provides interface for expanding understanding of a black-box classification or regression ML model. Can be accessed through TensorBoard or as an extension in a Jupyter or Colab notebook. Does not seem to be an active codebase.
Aequitas. Open source bias auditing and Fair ML toolkit. This already seems to be contained within AI Verify, at least the 'fairness tree'.
Facets. Open source toolkit for understanding and analyzing ML datasets. Note that it does not include ML models.
Fairness Indicators. Open source Python package which enables easy computation of commonly-identified fairness metrics for binary and multiclass classifiers. Part of TensorFlow.
Fairlearn. Open source Python package that empowers developers of AI systems to assess their system's fairness and mitigate any observed unfairness issues (a usage sketch follows at the end of this list).
Dalex. The DALEX package x-rays any model, helps to explore and explain its behavior, and helps to understand how complex models work. The main function explain() creates a wrapper around a predictive model. Wrapped models may then be explored and compared with a collection of local and global explainers. It builds on recent developments in the area of Interpretable Machine Learning/eXplainable Artificial Intelligence (a usage sketch follows at the end of this list).
SigmaRed. The SigmaRed platform enables comprehensive third-party AI risk management (AI TPRM) and rapidly reduces the cycle time of conducting AI risk assessments, while providing deep visibility, control, stakeholder-based reporting, and a detailed evidence repository. Does not seem to be open source.
Anch.ai. An end-to-end cloud solution that empowers global data-driven organizations to govern and deploy responsible, transparent, and explainable AI aligned with the upcoming EU AI Act. Does not seem to be open source.
CredoAI. Credo AI is an AI governance platform that helps companies adopt, scale, and govern AI safely and effectively. Does not seem to be open source.
Paper by TNO about the FATE system. The acronym stands for \"FAir, Transparent and Explainable Decision Making.\"
Tools mentioned include some of the above: Aequitas, AI Fairness 360, Dalex, Fairlearn, Responsibly, and What-If-Tool.
Links: Paper, Article, Microsoft links.
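As promised above, a minimal sketch of a Fairlearn fairness assessment. The data is invented for illustration; MetricFrame and demographic_parity_difference are part of fairlearn.metrics.

# Sketch of a Fairlearn fairness assessment; toy data for illustration.
from fairlearn.metrics import MetricFrame, demographic_parity_difference
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]
sex = ['f', 'f', 'f', 'm', 'm', 'm', 'f', 'm']  # sensitive feature

# Accuracy overall and per subgroup of the sensitive feature
mf = MetricFrame(metrics=accuracy_score, y_true=y_true, y_pred=y_pred,
                 sensitive_features=sex)
print(mf.overall)
print(mf.by_group)

# A single disparity number over the predictions
print(demographic_parity_difference(y_true, y_pred, sensitive_features=sex))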
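And a minimal sketch for Dalex: in the Python port, the explain() wrapper described above corresponds to the dx.Explainer class. The model and dataset are chosen purely for illustration.

# Sketch of wrapping a model with Dalex; in the Python port the R
# explain() function corresponds to the dx.Explainer class.
import dalex as dx
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# The wrapper exposes a collection of local and global explainers
explainer = dx.Explainer(model, X, y, label='rf')
print(explainer.model_performance().result)         # global: performance
print(explainer.model_parts().result)               # global: variable importance
print(explainer.predict_parts(X.iloc[[0]]).result)  # local: one prediction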
"},{"location":"projects/tad/existing-tools/comparison/tools_comparison/","title":"Comparison of tools for transparency of algorithmic decision making","text":"We have researched a few tools which we want to investigate further, this document is the next step in that investigation. We created a checklist to compare these tools against. The Fulfilled column will give a numerical value based on whether that requirement is fulfilled or not between 0 and 1. Then the actual scoring is the fulfilled value times the priority (the priority is translated to numerical values in the following way: {M:4, S:3, C:2, W:-1}).
"},{"location":"projects/tad/existing-tools/comparison/tools_comparison/#summary-of-the-comparison","title":"Summary of the comparison","text":"Requirement AIVerify VerifyML IBM 360 Research Toolkit Holistic AI AI Assessment Tool Functionality 36 42 20 17 22.85 Reliability 13 4 16 16 15.4 Usability 9.4 0 0 0 13 Help & Documentation 2.8 1.5 6.4 1.6 0.55 Performance Efficiency 7.5 11 11 11 11 Maintainability 15.8 24.5 29 23.5 25.6 Security 8.3 2 2 2 7.5 Compatibility 12.5 14 14 10 11 Accessibility 0 0 0 0 0.3 Portability 10.5 4.5 5.1 7.5 11.4 Deployment 1.5 0.6 1.2 3.6 3 Legal & Compliance 19 16 16 16 19 Total 136.3 120.1 120.7 108.2 140.6"},{"location":"projects/tad/existing-tools/comparison/tools_comparison/#notable-differences-between-the-tools","title":"Notable differences between the tools","text":"AIVerify notes:
Technical tests are supported, but they can be quite slow because of the tool's overhead
More flexibility would need to be built in before people could use the technical tests
If you have many variables, they cannot be shown in the PDF
The error messages explaining why technical tests don't work on a model are not user-friendly
VerifyML notes:
This tool is not actively developed anymore; the parties involved have transferred their focus to AIVerify
This tool does not support assessments
IBM 360 toolkit notes:
The toolkit has strong backing from industry and the community
Many technical tests from the latest research are included, and mitigation algorithms are also supported
It is purely for developers and therefore has no support for assessments
Holistic AI:
Like the IBM 360 Toolkit it differentiates between different types of technical assessments, such as bias and explainability, but it is less extensive than the 360 toolkit
Holistic AI's ambitions are large: they want to cover efficacy, robustness, and privacy tests as well
It is a private company from the United Kingdom which has open-sourced part of its tool
AI Assessment Tool:
This tool does not have any technical tests, but outshines the others with its option to discuss assessments
It is also very performant
AIVerify
is a tool with a UI to execute both assessments and technical tests.
VerifyML
is a Python package to generate Model Cards.
Holistic AI
is a Python package to test for and mitigate Bias in your model.
IBM 360 Research Toolkit
is a Python and R package to test for Fairness & Explainability of your model.
AI Assessment Tool
is a tool with a UI to execute assessments and log discussions.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
's and assessment_card
's can be included directly into the system_card
or can be included as separate YAML files with help of a YAML-include mechanism. For clarity the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.
name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.
technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, these fields allow to store information on external providers. There can be multiple external providers.
name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.
interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Provide information on the user interface provided to the user responsible for its operation.
description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing a assessment card. This assessment card is an assessment card described in the next section. There can be multiple assessment cards, meaning multiple assessment were performed.
model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED).
license_name
(REQUIRED, string). Any license from the open source license list1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.license_link
(OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.model_index
(REQUIRED, list). There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts
uri
(OPTIONAL, string). URI that refers to a relevant model artifact.content-type
(OPTIONAL, string). Optional type, following the Content-Type convention. Recognized values are \"application/onnx\", to refer to an ONNX representation of the model.md5-checksum
(OPTIONAL, string) Optional checksum for the content of the file.parameters
(OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(OPTIONAL, list). This field allows to store meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the feature. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED.results
(OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(OPTIONAL, list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example \"5503434ddd753f426f4b38109466949a1217c2bb\".metrics
(OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(OPTIONAL, list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(OPTIONAL, list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.
name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(OPTIONAL, list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(REQUIRED, list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.
class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(REQUIRED, list)
x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the instrument in the instrument register.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the corresponding task in the instrument register.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.
name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.
version: {system_card_version}\nprovenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {system_name}\nupl: {upl_uri}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\ndescription: {system_description}\nlabels:\n - name: {label_name}\n value: {label_value}\nstatus: {system_status}\npublication_category: {system_publication_cat}\nbegin_date: {system_begin_date}\nend_date: {system_end_date}\ngoal_and_impact: {system_goal_and_impact}\nconsiderations: {system_considerations}\nrisk_management: {system_risk_management}\nhuman_intervention: {system_human_intervention}\nlegal_base:\n - name: {law_name}\n link: {law_uri}\nused_data: {system_used_data}\ntechnical_design: {technical_design}\nexternal_providers:\n - name: {name_external_provider}\n version: {version_external_provider}\nreferences:\n - {reference_uri}\ninteraction_details:\n - {system_interaction_details}\nversion_requirements:\n - {system_version_requirements}\ndeployment_variants:\n - {system_deployment_variants}\nhardware_requirements:\n - {system_hardware_requirements}\nproduct_markings:\n - {system_product_markings}\nuser_interface:\n - description: {system_user_interface}\n link: {system_user_interface_uri}\n snapshot: {system_user_interface_snapshot_uri}\n\nmodels:\n - !include {model_card_uri}\n\nassessments:\n - !include {assessment_card_uri}\n
"},{"location":"projects/tad/reporting-standard/#model-card","title":"Model Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nlanguage:\n - {lang_0}\nlicense:\n license_name: {license_name}\n license_link: {license_uri}\ntags:\n - {tag_0}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\n\nmodel-index:\n - name: {model_id}\n model: {model_uri}\n artifacts:\n - uri: {model_artifact_uri}\n - content-type: {model_artifact_type}\n - md5-checksum: {md5_checksum}\n parameters:\n - name: {parameter_name}\n dtype: {parameter_dtype}\n value: {parameter_value}\n labels:\n - name: {label_name}\n dtype: {label_type}\n value: {label_value}\n results:\n - task:\n - type: {task_type}\n name: {task_name}\n datasets:\n - type: {dataset_type}\n name: {dataset_name}\n split: {split}\n features:\n - {feature_name}\n revision: {dataset_version}\n metrics:\n - type: {metric_type}\n name: {metric_name}\n dtype: {metric_dtype}\n value: {metric_value}\n labels:\n - name: {label_name}\n type: {label_type}\n dtype: {label_type}\n value: {label_value}\n measurements:\n bar_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - name: {bar_name}\n value: {bar_value}\n graph_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - class: {class_name}\n feature: {feature_name}\n data:\n - x_value: {x_value}\n y_value: {y_value}\n
"},{"location":"projects/tad/reporting-standard/#assessment-card","title":"Assessment Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {assessment_name}\nurn: {urn}\ndate: {assessment_date}\ncontents:\n - question: {question_text}\n urn: {urn}\n answer: {answer_text}\n remarks: {remarks_text}\n authors:\n - name: {author_name}\n timestamp: {timestamp}\n
"},{"location":"projects/tad/reporting-standard/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids from Hugging Face datasets, while we also allow any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a1/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate yaml files with the help of a yaml-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
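To make the include mechanism concrete, here is a minimal sketch spread over three hypothetical files; all names and values are illustrative only:

# system_card.yaml
name: CatDetector
models:
- !include cat_classifier_model.yaml
assessments:
- !include iama.yaml

# cat_classifier_model.yaml
license: Apache-2.0
model-index:
- name: CatClassifier
  model: https://example.com/models/cat-classifier

# iama.yaml
name: IAMA
date: 2024-03-01
contents:
- question: \"What is the goal of the system?\"
  answer: \"Detecting cats in images.\"

Unfolding both !include statements yields the same result as embedding the two cards verbatim in system_card.yaml.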
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a1\".name
(OPTIONAL, string). Name used to describe the system.upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card. This assessment card is described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list[string]). A list of URIs, where each URI refers to a relevant model artifact that cannot be captured by any other field but is relevant to the model.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model task, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, in which case the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list). The data points of the graph.x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
name
(REQUIRED, string). The name of the assessment.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date and time of the answer.version: {system_card_version} # Optional. Example: \"0.1a1\"\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Optional. Example: iama.yaml.\n
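For reference, a filled-in sketch of a few of these fields; the values are either the examples from the comments above or hypothetical:

version: \"0.1a1\"
name: \"AangifteVertrekBuitenland\"
owners:
- oin: 00000001003214345000
  organization: BZK
status: \"production\"
publication_category: \"high_risk\"
begin_date: 2025-01-01
labels:
- name: domain
  value: migration

models:
- !include cat_classifier_model.yaml

assessments:
- !include iama.yaml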
"},{"location":"projects/tad/reporting-standard/0.1a1/#model-card","title":"Model Card","text":"language:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - {model_artifact} # Optional. URI to relevant model artifacts, if applicable.\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. 
So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
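As an illustration of metric labels, the following hypothetical metric records an accuracy computed only on examples where gender is male and age is 21; because two labels are present, the metric applies to the intersection of both subgroups:

metrics:
- type: accuracy
  name: \"test accuracy restricted to feature gender:male and age:21\"
  dtype: \"float\"
  value: 0.83
  labels:
  - name: gender
    type: feature
    dtype: \"string\"
    value: male
  - name: age
    type: feature
    dtype: \"int\"
    value: 21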
"},{"location":"projects/tad/reporting-standard/0.1a1/#assessment-card","title":"Assessment Card","text":"name: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 1711630721.\n
"},{"location":"projects/tad/reporting-standard/0.1a1/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts license ids from the Hugging Face license list, while we accept any license from the Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids from Hugging Face datasets, while we also allow any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a2/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate yaml files with the help of a yaml-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
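Besides referencing separate files with !include, a card can be embedded verbatim. A minimal hypothetical sketch of a system_card with an inline model card:

name: CatDetector
models:
- license: Apache-2.0
  model-index:
  - name: CatClassifier
    model: https://example.com/models/cat-classifier
assessments:
- !include iama.yaml

Both forms are equivalent; separate files are simply easier to maintain for extensive cards.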
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".name
(OPTIONAL, string). Name used to describe the system.upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card. This assessment card is described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts. For each artifact the following fields are present.
uri
(OPTIONAL, string). URI that refers to a relevant model artifact.content-type
(OPTIONAL, string). The content type of the artifact, following the Content-Type header convention. A recognized value is \"application/onnx\", which refers to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). MD5 checksum of the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model task, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, in which case the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list). The data points of the graph.x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
name
(REQUIRED, string). The name of the assessment.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date and time of the answer.version: {system_card_version} # Optional. Example: \"0.1a2\"\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Optional. Example: iama.yaml.\n
"},{"location":"projects/tad/reporting-standard/0.1a2/#model-card","title":"Model Card","text":"language:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. 
Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a2/#assessment-card","title":"Assessment Card","text":"name: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 1711630721.\n
"},{"location":"projects/tad/reporting-standard/0.1a2/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a2/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids from Hugging Face datasets, while we also allow any URL pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work, relevant metrics (such as the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a3/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in yaml.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in yaml. Example yaml files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate yaml files with the help of a yaml-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
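Since a system can contain several models, a system_card may reference multiple model cards, each kept in its own yaml file. A hypothetical sketch:

name: FraudScreeningSystem
models:
- !include risk_scoring_model.yaml
- !include document_classifier_model.yaml
assessments:
- !include iama.yaml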
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".name
(OPTIONAL, string). Name used to describe the system.upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs that point to relevant information about the system.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a yaml file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a yaml file containing an assessment card. This assessment card is described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts. For each artifact the following fields are present.
uri
(OPTIONAL, string). URI that refers to a relevant model artifact.content-type
(OPTIONAL, string). The content type of the artifact, following the Content-Type header convention. A recognized value is \"application/onnx\", which refers to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). MD5 checksum of the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model task, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, in which case the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar plot like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph plot like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list). The data points of the graph.x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
name
(REQUIRED, string). The name of the assessment.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
. There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, i.e. 2024-04-16T16:48:14Z
.version: {system_card_version} # Optional. Example: \"0.1a3\"\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Optional. Example: iama.yaml.\n
"},{"location":"projects/tad/reporting-standard/0.1a3/#model-card","title":"Model Card","text":"language:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. 
Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a3/#assessment-card","title":"Assessment Card","text":"name: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n
"},{"location":"projects/tad/reporting-standard/0.1a3/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a3/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model-index:results:datasets
field. Hugging Face only accepts one dataset, while we accept a list of datasets.
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset IDs from Hugging Face datasets, while we also allow any URL pointing to the dataset.
For this extension to work, relevant metrics (such as, for example, the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a4/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
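As a minimal sketch (the file names are hypothetical and not prescribed by the standard), a compact system_card that pulls in its model and assessment cards via !include could look like: schema_version: \"0.1a4\"\nname: ExampleSystem # Hypothetical name.\nmodels:\n- !include example_model.yaml # Hypothetical file.\nassessments:\n- !include example_iama.yaml # Hypothetical file.\n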
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable, the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example, the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list[string]). Name of an external provider, if relevant. There can be multiple external providers.references
(OPTIONAL, list[string]). Additional reference URIs that point to information about the system and are relevant.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can, for example, be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card. Such an assessment card is described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list, this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable, the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI that refers to a relevant model artifact.content-type
(OPTIONAL, string). Optional type, following the Content-Type convention. Recognized values are \"application/onnx\", to refer to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Optional checksum for the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list)x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.name
(REQUIRED, string). The name of the assessment.
date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.version: {system_card_version} # Optional. Example: \"0.1a1\"\nprovenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- {system_external_provider} # Optional. Reference to used external providers.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Required. Example: iama.yaml.\n
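For illustration, a provenance block filled in with the example values from the template above reads: provenance:\n git_commit_hash: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: 2024-04-16T16:48:14Z\n uri: https://github.com/MinBZK/tad-conversion-tool\n author: John Doe\n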
"},{"location":"projects/tad/reporting-standard/0.1a4/#model-card","title":"Model Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nlanguage:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. 
Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a4/#assessment-card","title":"Assessment Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n
"},{"location":"projects/tad/reporting-standard/0.1a4/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a4/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model-index:results:datasets
field. Hugging Face only accepts one dataset, while we accept a list of datasets.
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset IDs from Hugging Face datasets, while we also allow any URL pointing to the dataset.
For this extension to work, relevant metrics (such as, for example, the false positive rate) have to be added to the Hugging Face metrics; possibly this can be done in our organizational namespace.
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting standard so that information about an algorithmic system can be shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress. This means that the current standard is probably suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a5/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost 1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing for minimal cards to be compact in a single file. Extensive cards can be split up for readability and maintainability. Our standard allows for the !include
to be used anywhere.
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable, the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example, the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of to what extent there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, these fields allow storing information on external providers. There can be multiple external providers.name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs that point to information about the system and are relevant.interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Provide information on the user interface provided to the user responsible for its operation.description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can, for example, be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card. Such an assessment card is described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports in ISO 639. There can be multiple languages.
license
(REQUIRED, string). Any license from the open source license list 1. If the license is NOT present in the license list, this field must be set to 'other' and the following two fields will be REQUIRED.
license_name
(string). An id for the license.license_link
(string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable, the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts.
uri
(OPTIONAL, string). URI that refers to a relevant model artifact.content-type
(OPTIONAL, string). Optional type, following the Content-Type convention. Recognized values are \"application/onnx\", to refer to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Optional checksum for the content of the file.parameters
(list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(list). This field allows storing meta information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the label. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the label. If name
is set, this field is REQUIRED.results
(list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model tasks, for example \"Object Classification\".datasets
(list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example 5503434ddd753f426f4b38109466949a1217c2bb.metrics
(list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(list). This field allows storing meta information about a metric. Metrics can, for example, be computed on subgroups of specific features: one can compute the accuracy for examples where the feature \"gender\" is set to \"male\". There can be multiple subgroups, which means that the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(list). Results contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(list)x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of the person who initiated the transformations.name
(REQUIRED, string). The name of the assessment.
date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.version: {system_card_version} # Optional. Example: \"0.1a1\"\nprovenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {system_name} # Optional. Example: \"AangifteVertrekBuitenland\"\nupl: {upl_uri} # Optional. Example: https://standaarden.overheid.nl/owms/terms/AangifteVertrekBuitenland\nowners:\n- oin: {oin} # Optional. Example: 00000001003214345000\n organization: {organization_name} # Optional if oin is provided, Required otherwise. Example: BZK\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\ndescription: {system_description} # Optional. Short description of the system.\nlabels: # Optional. Labels to store metadata about the system.\n- name: {label_name} # Optional.\n value: {label_value} # Optional.\nstatus: {system_status} # Optional. Example: \"production\".\npublication_category: {system_publication_cat} # Optional. Example: \"high_risk\".\nbegin_date: {system_begin_date} # Optional. Example: 2025-01-01.\nend_date: {system_end_date} # Optional. Example: 2025-12-01.\ngoal_and_impact: {system_goal_and_impact} # Optional. Goal and impact of the system.\nconsiderations: {system_considerations} # Optional. Considerations about the system.\nrisk_management: {system_risk_management} # Optional. Description of risks associated with the system.\nhuman_intervention: {system_human_intervention} # Optional. Description of human involvement in the system.\nlegal_base:\n- name: {law_name} # Optional. Example: \"AVG\".\n link: {law_uri} # Optional. Example: \"https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046\".\nused_data: {system_used_data} # Optional. Description of the data used by the system.\ntechnical_design: {technical_design} # Optional. Description of the technical design of the system.\nexternal_providers:\n- name: {name_external_provider} # Optional. Reference to used external providers.\n version: {version_external_provider} # Optional. Version used of the external provider.\nreferences:\n- {reference_uri} # Optional. Example: URI to codebase.\ninteraction_details:\n- {system_interaction_details} # Optional. Example: \"GPS modules for location tracking\"\nversion_requirements:\n- {system_version_requirements} # Optional. Example: \">version2.1\"\ndeployment_variants:\n- {system_deployment_variants} # Optional. Example: \"Web Application\"\nhardware_requirements:\n- {system_hardware_requirements} # Optional. Example: \"8 cores, 16 threads CPU\"\nproduct_markings:\n- {system_product_markings} # Optional. Example: \"Model number in the info menu\"\nuser_interface:\n- description: {system_user_interface} # Optional. Example: \"web-based dashboard\"\n link: {system_user_interface_uri} # Optional. Example: \"http://example.com/content\"\n snapshot: {system_user_interface_snapshot_uri} # Optional. Example: \"http://example.com/snapshot.png\"\n\nmodels:\n- !include {model_card_uri} # Optional. Example: cat_classifier_model.yaml.\n\nassessments:\n- !include {assessment_card_uri} # Required. Example: iama.yaml.\n
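As a sketch of the fields introduced in this version, filled in with the illustrative values from the template above (the provider name is invented): external_providers:\n- name: ExampleProvider # Hypothetical provider.\n version: \"2.1\"\ninteraction_details:\n- \"GPS modules for location tracking\"\ndeployment_variants:\n- \"Web Application\"\nuser_interface:\n- description: \"web-based dashboard\"\n link: \"http://example.com/content\"\n snapshot: \"http://example.com/snapshot.png\"\n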
"},{"location":"projects/tad/reporting-standard/0.1a5/#model-card","title":"Model Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nlanguage:\n - {lang_0} # Optional. Example nl.\nlicense: {license} # Required. Example: Apache-2.0 or any license SPDX ID from https://opensource.org/license or \"other\".\nlicense_name: {license_name} # Optional if license != other, Required otherwise. Example: 'my-license-1.0'\nlicense_link: {license_link} # Optional if license != other, Required otherwise. Specify \"LICENSE\" or \"LICENSE.md\" to link to a file of that name inside the repo, or a URL to a remote file.\ntags:\n- {tag_0} # Optional. Example: audio\n- {tag_1} # Optional. Example: automatic-speech-recognition\nowners:\n- organization: {organization_name} # Required. Example: BZK\n oin: {oin} # Optional. Example: 00000001003214345000\n name: {owner_name} # Optional. Example: John Doe\n email: {owner_email} # Optional. Example: johndoe@email.com\n role: {owner_role} # Optional. Example: Data Scientist.\n\nmodel-index:\n- name: {model_id} # Required. Example: CatClassifier.\n model: {model_uri} # Required. URI to a repository containing the model file.\n artifacts:\n - uri: {model_artifact_uri} # Optional. Example: \"https://github.com/MinBZK/poc-kijkdoos-wasm-models/raw/main/logres_iris/logreg_iris.onnx\"\n - content-type: {model_artifact_type} # Optional. Example: \"application/onnx\".\n - md5-checksum: {md5_checksum} # Optional. Example: \"120EA8A25E5D487BF68B5F7096440019\"\n parameters:\n - name: {parameter_name} # Optional. Example: \"epochs\".\n dtype: {parameter_dtype} # Optional. Example: \"int\".\n value: {parameter_value} # Optional. Example: 100.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n results:\n - task:\n type: {task_type} # Required. Example: image-classification.\n name: {task_name} # Optional. Example: Image Classification.\n datasets:\n - type: {dataset_type} # Required. Example: common_voice. Link to a repository containing the dataset\n name: {dataset_name} # Required. Example: \"Common Voice (French)\". A pretty name for the dataset.\n split: {split} # Optional. Example: \"train\".\n features:\n - {feature_name} # Optional. Example: \"gender\".\n revision: {dataset_version} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n metrics:\n - type: {metric_type} # Required. Example: false-positive-rate. Use metric id from https://hf.co/metrics.\n name: {metric_name} # Required. Example: \"FPR wrt class 0 restricted to feature gender:0 and age:21\".\n dtype: {metric_dtype} # Required. Example: \"float\".\n value: {metric_value} # Required. Example: 0.75.\n labels:\n - name: {label_name} # Optional. Example: \"gender\".\n type: {label_type} # Optional. Example: \"feature\".\n dtype: {label_type} # Optional. Example: \"string\".\n value: {label_value} # Optional. Example: \"female\".\n measurements:\n # Bar plots should be able to capture SHAP and Robustness Toolbox from AI Verify.\n bar_plots:\n - type: {measurement_type} # Required. Example: \"SHAP\".\n name: {measurement_name} # Optional. 
Example: \"Mean Absolute Shap Values\".\n results:\n - name: {bar_name} # Required. The name of a bar.\n value: {bar_value} # Required. The corresponding value.\n # Graph plots should be able to capture graph based measurements such as partial dependence and accumulated local effect.\n graph_plots:\n - type: {measurement_type} # Required. Example: \"partial_dependence\".\n name: {measurement_name} # Optional. Example: \"Partial Dependence Plot\".\n # Results store the graph plot data. So far all plots are dependent on a combination of a specific class (sometimes) and feature (always).\n # For example partial dependence plots are made for each feature and class.\n results:\n - class: {class_name} # Optional. Name of the output class the graph depends on.\n feature: {feature_name} # Required. Name of the feature the graph depends on.\n data:\n - x_value: {x_value} # Required. The x value of the graph data.\n y_value: {y_value} # Required. The y value of the graph data.\n
"},{"location":"projects/tad/reporting-standard/0.1a5/#assessment-card","title":"Assessment Card","text":"provenance: # Optional.\n git_commit_hash: {git_commit_hash} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb\n timestamp: {modification_timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n uri: {modification_uri} # Optional. Example: https://github.com/MinBZK/tad-conversion-tool\n author: {modification_author} # Optional. Example: John Doe\nname: {assessment_name} # Required. Example: IAMA.\ndate: {assessment_date} # Required. Example: 25-03-2025.\ncontents:\n - question: {question_text} # Required. Example: \"Question 1: ...\".\n answer: {answer_text} # Required. Example: \"Answer: ...\".\n remarks: {remarks_text} # Optional. Example: \"Remarks: ...\".\n authors: # Optional. Example: \"['John', 'Peter']\".\n - name: {author_name}\n timestamp: {timestamp} # Optional. Example: 2024-04-16T16:48:14Z.\n
"},{"location":"projects/tad/reporting-standard/0.1a5/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a5/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model-index:results:datasets
field. Hugging Face only accepts one dataset, while we accept a list of datasets.
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset id's from Hugging Face datasets while we also allow for any url pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work relevant metrics (such as for example false positive rate) have to be added to the Hugging Face metrics, possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing, and sharing of algorithmic systems, it is essential to have a reporting standard that defines how information about an algorithmic system is shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress: the current standard is likely suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/0.1a6/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing minimal cards to remain compact in a single file, while extensive cards can be split up for readability and maintainability. Our standard allows the !include
to be used anywhere.
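The standard does not prescribe how !include is implemented. As an illustration only, a minimal sketch using PyYAML could look as follows; the file name is hypothetical, and included paths are resolved relative to the including file.

```python
import os
import yaml

class IncludeLoader(yaml.SafeLoader):
    """SafeLoader that understands !include tags."""
    def __init__(self, stream):
        # Resolve includes relative to the file being loaded, if known.
        self._root = os.path.dirname(getattr(stream, "name", "."))
        super().__init__(stream)

def _include(loader: IncludeLoader, node: yaml.Node):
    path = os.path.join(loader._root, loader.construct_scalar(node))
    with open(path) as f:
        return yaml.load(f, IncludeLoader)  # nested !includes keep working

IncludeLoader.add_constructor("!include", _include)

with open("system_card.yaml") as f:  # hypothetical file name
    system_card = yaml.load(f, IncludeLoader)
```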
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z (a sketch for generating such a provenance block follows this field list)
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta-information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example, the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.
name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.
technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, this field allows storing information about external providers. There can be multiple external providers.
name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs pointing to relevant information about the system.
interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Information on the user interface provided to the user responsible for operating the system.
description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card, as described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.
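As referenced above, here is a sketch, our own illustration rather than a normative tool, of generating the provenance block of a card, assuming the card is produced by a transformation running inside a git repository:

```python
import subprocess
from datetime import datetime, timezone

def provenance(uri: str, author: str) -> dict:
    """Build the provenance block described in the field list above."""
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return {
        "git_commit_hash": commit,
        # ISO 8601 in UTC, with the zero offset represented as Z:
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "uri": uri,
        "author": author,
    }

print(provenance("https://github.com/MinBZK/tad-conversion-tool", "John Doe"))
```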
model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports, as ISO 639 codes. There can be multiple languages.
license
(REQUIRED).
license_name
(REQUIRED, string). Any license from the open source license list1. If the license is NOT present in the license list, this field must be set to 'other' and the following two fields will be REQUIRED.license_link
(OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.model_index
(REQUIRED, list). There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts. For each artifact the following fields are present.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). Optional content type, following the Content-Type header convention. Recognized values include \"application/onnx\", referring to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Optional MD5 checksum of the contents of the file.parameters
(OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(OPTIONAL, list). This field allows storing meta-information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the feature. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED.results
(OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model task, for example \"Object Classification\".datasets
(OPTIONAL, list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example \"5503434ddd753f426f4b38109466949a1217c2bb\".metrics
(OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(OPTIONAL, list). This field allows storing meta-information about a metric. Metrics can, for example, be computed on subgroups of specific features, such as the accuracy for examples where the feature \"gender\" is set to \"male\". When there are multiple subgroups, the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(OPTIONAL, list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.
name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(OPTIONAL, list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(REQUIRED, list). This field contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.
class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(REQUIRED, list)
x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.
name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.
schema_version: {system_card_version}\nprovenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {system_name}\nupl: {upl_uri}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\ndescription: {system_description}\nlabels:\n - name: {label_name}\n value: {label_value}\nstatus: {system_status}\npublication_category: {system_publication_cat}\nbegin_date: {system_begin_date}\nend_date: {system_end_date}\ngoal_and_impact: {system_goal_and_impact}\nconsiderations: {system_considerations}\nrisk_management: {system_risk_management}\nhuman_intervention: {system_human_intervention}\nlegal_base:\n - name: {law_name}\n link: {law_uri}\nused_data: {system_used_data}\ntechnical_design: {technical_design}\nexternal_providers:\n - name: {name_external_provider}\n version: {version_external_provider}\nreferences:\n - {reference_uri}\ninteraction_details:\n - {system_interaction_details}\nversion_requirements:\n - {system_version_requirements}\ndeployment_variants:\n - {system_deployment_variants}\nhardware_requirements:\n - {system_hardware_requirements}\nproduct_markings:\n - {system_product_markings}\nuser_interface:\n - description: {system_user_interface}\n link: {system_user_interface_uri}\n snapshot: {system_user_interface_snapshot_uri}\n\nmodels:\n - !include {model_card_uri}\n\nassessments:\n - !include {assessment_card_uri}\n
"},{"location":"projects/tad/reporting-standard/0.1a6/#model-card","title":"Model Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nlanguage:\n - {lang_0}\nlicense:\n license_name: {license_name}\n license_link: {license_uri}\ntags:\n - {tag_0}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\n\nmodel-index:\n - name: {model_id}\n model: {model_uri}\n artifacts:\n - uri: {model_artifact_uri}\n - content-type: {model_artifact_type}\n - md5-checksum: {md5_checksum}\n parameters:\n - name: {parameter_name}\n dtype: {parameter_dtype}\n value: {parameter_value}\n labels:\n - name: {label_name}\n dtype: {label_type}\n value: {label_value}\n results:\n - task:\n - type: {task_type}\n name: {task_name}\n datasets:\n - type: {dataset_type}\n name: {dataset_name}\n split: {split}\n features:\n - {feature_name}\n revision: {dataset_version}\n metrics:\n - type: {metric_type}\n name: {metric_name}\n dtype: {metric_dtype}\n value: {metric_value}\n labels:\n - name: {label_name}\n type: {label_type}\n dtype: {label_type}\n value: {label_value}\n measurements:\n bar_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - name: {bar_name}\n value: {bar_value}\n graph_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - class: {class_name}\n feature: {feature_name}\n data:\n - x_value: {x_value}\n y_value: {y_value}\n
"},{"location":"projects/tad/reporting-standard/0.1a6/#assessment-card","title":"Assessment Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {assessment_name}\ndate: {assessment_date}\ncontents:\n - question: {question_text}\n answer: {answer_text}\n remarks: {remarks_text}\n authors:\n - name: {author_name}\n timestamp: {timestamp}\n
"},{"location":"projects/tad/reporting-standard/0.1a6/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/0.1a6/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset id's from Hugging Face datasets while we also allow for any url pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work relevant metrics (such as for example false positive rate) have to be added to the Hugging Face metrics, possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
This document describes the Transparency of Algorithmic Decision making (TAD) Reporting Standard.
For reproducibility, governance, auditing, and sharing of algorithmic systems, it is essential to have a reporting standard that defines how information about an algorithmic system is shared. This reporting standard describes how information about the different phases of an algorithm's life cycle can be reported. It contains, among other things, descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The TAD Reporting Standard is a work in progress: the current standard is likely suboptimal and will change significantly in future versions.
"},{"location":"projects/tad/reporting-standard/latest/#introduction","title":"Introduction","text":"Inspired by Model Cards for Model Reporting and Papers with Code Model Index this standard almost1 2 3 4 extends the Hugging Face model card metadata specification to allow for:
metrics_field
from the Hugging Face metadata specification.measurements
.assessments
.Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card
, containing information about a group of ML-models which accomplish a specific task.model_card
, containing information about a specific data science model.assessment_card
, containing information about a regulatory assessment.Include statements
These model_card
s and assessment_card
s can be included verbatim into a system_card
, or referenced with an !include
statement, allowing minimal cards to remain compact in a single file, while extensive cards can be split up for readability and maintainability. Our standard allows the !include
to be used anywhere.
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three cards: a system_card
, a model_card
and an assessment_card
. A system_card
contains information about an algorithmic system. It can have multiple models and each of these models should have a model_card
. Regulatory assessments can be processed in an assessment_card
. Note that model_card
s and assessment_card
s can be included directly into the system_card
or can be included as separate YAML files with the help of a YAML-include mechanism. For clarity, the latter is preferred and is also used in the examples in the next section.
system_card
","text":"A system_card
contains the following information.
schema_version
(REQUIRED, string). Version of the schema used, for example \"0.1a2\".provenance
(OPTIONAL). In case this System Card is generated from another source file, this field can capture the historical context of the contents of this System Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(OPTIONAL, string). Name used to describe the system.
upl
(OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government, it should contain a URI from the Uniform Product List.owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the system. If oin
is NOT provided, this field is REQUIRED (see the sketch after this field list).name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.description
(OPTIONAL, string). A short description of the system.
labels
(OPTIONAL, list). This field allows storing meta-information about a system. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). Name of the label.value
(OPTIONAL, string). Value of the label.status
(OPTIONAL, string). The status of the system. For example, the status can be \"production\".
publication_category
(OPTIONAL, enum[string]). The publication category of the algorithm should be chosen from [\"high_risk\", \"other\"]
.begin_date
(OPTIONAL, string). The first date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.end_date
(OPTIONAL, string). The last date the system was used. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.goal_and_impact
(OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.considerations
(OPTIONAL, string). The pros and cons of using the system.risk_management
(OPTIONAL, string). Description of the risks associated with the system.human_intervention
(OPTIONAL, string). A description of the extent to which there is human involvement in the system.legal_base
(OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields are present.
name
(OPTIONAL, string). Name of the law.link
(OPTIONAL, string). URI pointing towards the contents of the law.used_data
(OPTIONAL, string). An overview of the data that is used in the system.
technical_design
(OPTIONAL, string). Description of how the system works.external_providers
(OPTIONAL, list). If relevant, this field allows storing information about external providers. There can be multiple external providers.
name
(OPTIONAL, string). Name of the external provider.version
(OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.references
(OPTIONAL, list[string]). Additional reference URIs pointing to relevant information about the system.
interaction_details
(OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software, including other AI systems, or how the AI system can be used to interact with hardware or software.version_requirements
(OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any requirements related to version updates.deployment_variants
(OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the market or put into service, such as software packages embedded into hardware, downloads, or APIs.hardware_requirements
(OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must be run.product_markings
(OPTIONAL, list[string]). If the AI system is a component of products, photos, or illustrations, describe the external features, markings, and internal layout of those products.user_interface
(OPTIONAL, list). Information on the user interface provided to the user responsible for operating the system.
description
(OPTIONAL, string). A description of the provided user interface.link
(OPTIONAL, string). A link to the user interface can be included.snapshot
(OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a hyperlink.models
(OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !include
s of a YAML file containing a model card. This model card can for example be a model card described in the next section or a model card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments
(OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !include
s of a YAML file containing an assessment card, as described in the next section. There can be multiple assessment cards, meaning multiple assessments were performed.
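As referenced in the owners field above, a minimal sketch (our own illustration, not a normative validator) of checking two of the rules in this field list, the oin/organization rule and the ISO 8601 dates, could look like this:

```python
from datetime import date

def check_system_card(card: dict) -> list[str]:
    """Report violations of two system_card rules from the field list above."""
    problems = []
    for owner in card.get("owners") or []:
        if not owner.get("oin") and not owner.get("organization"):
            problems.append("owner: 'organization' is REQUIRED when 'oin' is not provided")
    for field in ("begin_date", "end_date"):
        value = card.get(field)
        if value is None:
            continue
        try:
            date.fromisoformat(value)  # accepts ISO 8601 dates (YYYY-MM-DD)
        except ValueError:
            problems.append(f"{field}: not ISO 8601 (YYYY-MM-DD): {value!r}")
    return problems

print(check_system_card({"owners": [{"name": "John Doe"}], "begin_date": "01-05-2023"}))
```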
model_card
","text":"A model_card
contains the following information.
provenance
(OPTIONAL). In case this Model Card is generated from another source file, this field can capture the historical context of the contents of this Model Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.language
(OPTIONAL, list[string]). If relevant, the natural languages the model supports, as ISO 639 codes. There can be multiple languages.
license
(REQUIRED).
license_name
(REQUIRED, string). Any license from the open source license list1. If the license is NOT present in the license list, this field must be set to 'other' and the following two fields will be REQUIRED.license_link
(OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file containing the license contents.tags
(OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners
(OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
oin
(OPTIONAL, string). If applicable the Organisatie-identificatienummer (OIN).organization
(OPTIONAL, string). Name of the organization that owns the model. If oin
is NOT provided, this field is REQUIRED.name
(OPTIONAL, string). Name of a contact person within the organization.email
(OPTIONAL, string). Email address of the contact person or organization.role
(OPTIONAL, string). Role of the contact person. This field should only be set when the name
field is set.model_index
(REQUIRED, list). There can be multiple models. For each model the following fields are present.
name
(REQUIRED, string). The name of the model.model
(REQUIRED, string). A URI pointing to a repository containing the model file.artifacts
(OPTIONAL, list). A list of artifacts. For each artifact the following fields are present.
uri
(OPTIONAL, string). URI referring to a relevant model artifact.content-type
(OPTIONAL, string). Optional content type, following the Content-Type header convention. Recognized values include \"application/onnx\", referring to an ONNX representation of the model.md5-checksum
(OPTIONAL, string). Optional MD5 checksum of the contents of the file.parameters
(OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are present.
name
(REQUIRED, string). The name of the parameter, for example \"epochs\".dtype
(OPTIONAL, string). The datatype of the parameter, for example \"int\".value
(OPTIONAL, string). The value of the parameter, for example 100.labels
(OPTIONAL, list). This field allows storing meta-information about a parameter. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the label.dtype
(OPTIONAL, string). The datatype of the feature. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED.results
(OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task
(OPTIONAL, list).
task_type
(REQUIRED, string). The task of the model, for example \"object-classification\".task_name
(OPTIONAL, string). A pretty name for the model task, for example \"Object Classification\".datasets
(OPTIONAL, list). There can be multiple datasets 2. For each dataset the following fields are present.
type
(REQUIRED, string). The type of the dataset, can be a dataset id from Hugging Face datasets or any other link to a repository containing the dataset3, for example \"common_voice\".name
(REQUIRED, string). A pretty name for the dataset, for example \"Common Voice (French)\".split
(OPTIONAL, string). The split of the dataset, for example \"train\".features
(OPTIONAL, list[string]). List of feature names.revision
(OPTIONAL, string). Version of the dataset, for example \"5503434ddd753f426f4b38109466949a1217c2bb\".metrics
(OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type
(REQUIRED, string). A metric-id from Hugging Face metrics4, for example accuracy.name
(REQUIRED, string). A descriptive name of the metric. For example \"false positive rate\" is not a descriptive name, but \"training false positive rate w.r.t class x\" is.dtype
(REQUIRED, string). The data type of the metric, for example float
.value
(REQUIRED, string). The value of the metric.labels
(OPTIONAL, list). This field allows storing meta-information about a metric. Metrics can, for example, be computed on subgroups of specific features, such as the accuracy for examples where the feature \"gender\" is set to \"male\". When there are multiple subgroups, the metric is computed on the intersection of those subgroups. There can be multiple labels. For each label the following fields are present.
name
(OPTIONAL, string). The name of the feature. For example: \"gender\".type
(OPTIONAL, string). The type of the label. Can for example be set to \"feature\" or \"output_class\". If name
is set, this field is REQUIRED.dtype
(OPTIONAL, string). The datatype of the feature, for example float
. If name
is set, this field is REQUIRED.value
(OPTIONAL, string). The value of the feature. If name
is set, this field is REQUIRED. For example: \"male\".measurements
.
bar_plots
(OPTIONAL, list). The purpose of this field is to capture bar-plot-like measurements, for example SHAP values. There can be multiple bar plots. For each bar plot the following fields are present.
type
(REQUIRED, string). The type of bar plot, for example \"SHAP\".name
(OPTIONAL, string). A pretty name for the plot, for example \"Mean Absolute SHAP Values\".results
(REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be multiple results. For each result the following fields are present.
name
(REQUIRED, string). The name of the bar.value
(REQUIRED, float). The value of the corresponding bar.graph_plots
(OPTIONAL, list). The purpose of this field is to capture graph-plot-like measurements, such as partial dependence plots. There can be multiple graph plots. For each graph plot the following fields are present.
type
(REQUIRED, string). The type of the graph plot, for example \"partial_dependence\".name
(OPTIONAL, string). A pretty name of the graph, for example \"Partial Dependence Plot\".results
(REQUIRED, list). This field contains the graph plot data. Each graph can depend on a specific output class and feature. There can be multiple results. For each result the following fields are present.
class
(OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to. This field is not always present.feature
(REQUIRED, string). The feature the graph corresponds to. This is required, since all relevant graphs are dependent on features.data
(REQUIRED, list)
x_value
(REQUIRED, float). The $x$-value of the graph.y_value
(REQUIRED, float). The $y$-value of the graph.assessment_card
","text":"An assessment_card
contains the following information.
provenance
(OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture the historical context of the contents of this Assessment Card.
git_commit_hash
(OPTIONAL, string). Git commit hash of the commit which contains the transformation file used to create this card.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.uri
(OPTIONAL, string). URI to the tool that was used to perform the transformations.author
(OPTIONAL, string). Name of person that initiated the transformations.name
(REQUIRED, string). The name of the assessment.
urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the instrument in the instrument register.date
(REQUIRED, string). The date at which the assessment is completed. Date should be given in ISO 8601 format, i.e. YYYY-MM-DD
.contents
(REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question
(REQUIRED, string). A question.urn
(OPTIONAL, string). A Uniform Resource Name (URN) of the corresponding task in the instrument register.answer
(REQUIRED, string). An answer.remarks
(OPTIONAL, string). A field to put relevant discussion remarks in.authors
(OPTIONAL, list). There can be multiple names. For each name the following field is present.
name
(OPTIONAL, string). The name of the author of the question.timestamp
(OPTIONAL, string). A timestamp of the date, time and timezone of the answer. Timestamp should be given, preferably in UTC (represented as Z
), in ISO 8601 format, e.g. 2024-04-16T16:48:14Z
.
schema_version: {system_card_version}\nprovenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {system_name}\nupl: {upl_uri}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\ndescription: {system_description}\nlabels:\n - name: {label_name}\n value: {label_value}\nstatus: {system_status}\npublication_category: {system_publication_cat}\nbegin_date: {system_begin_date}\nend_date: {system_end_date}\ngoal_and_impact: {system_goal_and_impact}\nconsiderations: {system_considerations}\nrisk_management: {system_risk_management}\nhuman_intervention: {system_human_intervention}\nlegal_base:\n - name: {law_name}\n link: {law_uri}\nused_data: {system_used_data}\ntechnical_design: {technical_design}\nexternal_providers:\n - name: {name_external_provider}\n version: {version_external_provider}\nreferences:\n - {reference_uri}\ninteraction_details:\n - {system_interaction_details}\nversion_requirements:\n - {system_version_requirements}\ndeployment_variants:\n - {system_deployment_variants}\nhardware_requirements:\n - {system_hardware_requirements}\nproduct_markings:\n - {system_product_markings}\nuser_interface:\n - description: {system_user_interface}\n link: {system_user_interface_uri}\n snapshot: {system_user_interface_snapshot_uri}\n\nmodels:\n - !include {model_card_uri}\n\nassessments:\n - !include {assessment_card_uri}\n
"},{"location":"projects/tad/reporting-standard/latest/#model-card","title":"Model Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nlanguage:\n - {lang_0}\nlicense:\n license_name: {license_name}\n license_link: {license_uri}\ntags:\n - {tag_0}\nowners:\n - oin: {oin}\n organization: {organization_name}\n name: {owner_name}\n email: {owner_email}\n role: {owner_role}\n\nmodel-index:\n - name: {model_id}\n model: {model_uri}\n artifacts:\n - uri: {model_artifact_uri}\n - content-type: {model_artifact_type}\n - md5-checksum: {md5_checksum}\n parameters:\n - name: {parameter_name}\n dtype: {parameter_dtype}\n value: {parameter_value}\n labels:\n - name: {label_name}\n dtype: {label_type}\n value: {label_value}\n results:\n - task:\n - type: {task_type}\n name: {task_name}\n datasets:\n - type: {dataset_type}\n name: {dataset_name}\n split: {split}\n features:\n - {feature_name}\n revision: {dataset_version}\n metrics:\n - type: {metric_type}\n name: {metric_name}\n dtype: {metric_dtype}\n value: {metric_value}\n labels:\n - name: {label_name}\n type: {label_type}\n dtype: {label_type}\n value: {label_value}\n measurements:\n bar_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - name: {bar_name}\n value: {bar_value}\n graph_plots:\n - type: {measurement_type}\n name: {measurement_name}\n results:\n - class: {class_name}\n feature: {feature_name}\n data:\n - x_value: {x_value}\n y_value: {y_value}\n
"},{"location":"projects/tad/reporting-standard/latest/#assessment-card","title":"Assessment Card","text":"provenance:\n git_commit_hash: {git_commit_hash}\n timestamp: {modification_timestamp}\n uri: {modification_uri}\n author: {modification_author}\nname: {assessment_name}\nurn: {urn}\ndate: {assessment_date}\ncontents:\n - question: {question_text}\n urn: {urn}\n answer: {answer_text}\n remarks: {remarks_text}\n authors:\n - name: {author_name}\n timestamp: {timestamp}\n
"},{"location":"projects/tad/reporting-standard/latest/#schema","title":"Schema","text":"JSON schema will be added when we publish the first beta version.
"},{"location":"projects/tad/reporting-standard/latest/#changelog","title":"Changelog","text":"Deviation from the Hugging Face specification is in the License field. Hugging Face only accepts dataset id's from Hugging Face license list while we accept any license from Open Source License List.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the model_index:results:dataset
field. Hugging Face only accepts one dataset, while we accept a list of datasets.\u00a0\u21a9\u21a9
Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset id's from Hugging Face datasets while we also allow for any url pointing to the dataset.\u00a0\u21a9\u21a9
For this extension to work relevant metrics (such as for example false positive rate) have to be added to the Hugging Face metrics, possibly this can be done in our organizational namespace.\u00a0\u21a9\u21a9
The purpose of a code review is to ensure the quality and readability of a change, and that all requirements from the ticket have been met, before it gets merged into the main codebase. Additionally, code reviews are a communication tool: they allow team members to stay aware of changes being made.
Code reviews involve having a team member examine the changes made by another team member and give feedback or ask questions if needed.
"},{"location":"way-of-working/code-reviews/#creating-a-pull-request","title":"Creating a Pull Request","text":"We use GitHub pull requests (PR) for code reviews. You can make a draft PR if your work is still in progress. When you are done you can remove the draft status. A team member may start reviewing when the PR does not have a draft status.
For team ADRs, at least 3 approving reviews are required; if the ADR can be expected to be controversial, all team members should approve.
A team ADR is an ADR made in the ai-validation repository.
All other PRs need at least 1 approving review, but can have more reviewers if desired (by either the reviewer or the author).
"},{"location":"way-of-working/code-reviews/#review-process","title":"Review process","text":"By default the codeowner, indicated in the CODEOWNER file, will be requested to review. For us this is the GitHub team AI-validation. If the PR creator wants a specific team member to review, the PR creator should add the team member specifically in the reviewers section of the PR. A message in Mattermost will be posted for PRs. Then with the reaction of an emoji a reviewer will indicate they are looking at the PR.
If the reviewer has suggestions or comments, the PR creator can fix those or respond to the suggestions. When the creator of the PR thinks they are done with the feedback, they must re-request a review from the person that did the review. The reviewer must then look at the changes and approve or add more comments. This process continues until the reviewer agrees that everything is correct and approves the PR.
Once the review is approved, the reviewer checks if the branch is in sync with the main branch before merging. If not, the reviewer rebases the branch. Once the branch is in sync with main, the reviewer merges the PR and checks if the deployment is successful. If the deployment is not successful, the reviewer fixes it. If the PR needs more than one review, the last approving reviewer merges the PR.
"},{"location":"way-of-working/contributing/","title":"Contributing to AI Validation","text":"First off, thanks for taking the time to contribute! \u2764\ufe0f
All types of contributions are encouraged and valued. See the Table of Contents for different ways to help and details about how this project handles them. Please make sure to read the relevant section before making your contribution. It will make it a lot easier for us maintainers and smooth out the experience for all involved. The community looks forward to your contributions. \ud83c\udf89
"},{"location":"way-of-working/contributing/#table-of-contents","title":"Table of Contents","text":"This project and everyone participating in it is governed by the Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to ai-validatie@minbzk.nl.
"},{"location":"way-of-working/contributing/#i-have-a-question","title":"I Have a Question","text":"Before you ask a question, it is best to search for existing Issues that might help you. In case you have found a suitable issue and still need clarification, you can write your question in this issue.
If you then still feel the need to ask a question and need clarification, we recommend the following:
We will then take care of the issue as soon as possible.
"},{"location":"way-of-working/contributing/#i-want-to-contribute","title":"I Want To Contribute","text":""},{"location":"way-of-working/contributing/#legal-notice","title":"Legal Notice","text":"When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content and that the content you contribute may be provided under the project license.
"},{"location":"way-of-working/contributing/#reporting-bugs","title":"Reporting Bugs","text":""},{"location":"way-of-working/contributing/#before-submitting-a-bug-report","title":"Before Submitting a Bug Report","text":"A good bug report shouldn't leave others needing to chase you up for more information. Therefore, we ask you to investigate carefully, collect information and describe the issue in detail in your report. Please complete the following steps in advance to help us fix any potential bug as fast as possible.
You must never report security-related issues, vulnerabilities or bugs that include sensitive information to the issue tracker or elsewhere in public. Instead, sensitive bugs must be sent by email to ai-validatie@minbzk.nl.
We use GitHub issues to track bugs and errors. If you run into an issue with the project:
Once it's filed:
needs-repro
. Bugs with the needs-repro
tag will not be addressed until they are reproduced.needs-fix
, as well as possibly other tags (such as critical
), and the issue will be left to be implemented by someone.This section guides you through submitting an enhancement suggestion for this project, including completely new features and minor improvements. Following these guidelines will help maintainers and the community to understand your suggestion and find related suggestions.
"},{"location":"way-of-working/contributing/#before-submitting-an-enhancement","title":"Before Submitting an Enhancement","text":"Enhancement suggestions are tracked as GitHub issues.
We have commit message conventions: Commit convention
"},{"location":"way-of-working/contributing/#markdown-lint","title":"Markdown Lint","text":"We use Markdown lint to standardize Markdown: Markdown lint config.
"},{"location":"way-of-working/contributing/#pre-commit","title":"Pre-commit","text":"We use pre-commit to enabled standardization: pre-commit config.
"},{"location":"way-of-working/decision-log/","title":"Decision Log","text":"Throughout our work, small decisions about processes and approaches are often made in meetings and chats. While these aren't big enough for formal documentation like ADRs, capturing them is valuable for both current and future team members.
This log provides a reference point for those decisions.
"},{"location":"way-of-working/decision-log/#overview-of-decisions","title":"Overview of decisions","text":"We're sad to see you go! But if you do, here's what not to forget.
"},{"location":"way-of-working/off-boarding/#github","title":"GitHub","text":"For clarity and consistency, this document defines some terms used within our team where the meaning in Data Science or Computer Science differs, and terms that are for any reason good to mention.
For a full reference for Machine Learning, we recommend ML Fundamentals from Google.
"},{"location":"way-of-working/onboarding/","title":"Onboarding","text":"Make sure you have installed Mattermost, then follow these steps.
Make sure you have installed Webex, then follow these steps.
Make sure you have installed Tuple, then follow these steps.
Create or use your existing GitHub account.
Bookmark these links in your browser:
We use the HashiCorp Vault secrets manager for team secrets. You can log in with a GitHub personal access token. The token needs organization read permissions (read:org
), and you should be part of our GitHub team to access the vault.
We are assuming your dev machine is a Mac. This guide is rather opinionated; feel free to have your own opinion, and feel free to contribute! Contributing can be done by clicking \"edit\" at the top right and making a pull request on this repository.
"},{"location":"way-of-working/onboarding/dev-machine/#things-that-should-have-been-default-on-mac","title":"Things that should have been default on Mac","text":"Homebrew as the missing Package Manager
/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"\n
Rectangle
brew install --cask rectangle\n
WebEx for video conferencing
brew install --cask webex\n
Mattermost for team communication
brew install --cask mattermost\n
iTerm2
brew install --cask iterm2\n
Oh My Zsh
/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)\"\n
Autosuggestions for zsh
git clone https://github.com/zsh-users/zsh-autosuggestions ~/.oh-my-zsh/custom/plugins/zsh-autosuggestions\n
Fish-shell-like syntax highlighting for Zsh
brew install zsh-syntax-highlighting\n
Add plugins to your shell in ~/.zshrc
plugins=(\n # other plugins...\n zsh-autosuggestions\n kubectl\n docker\n docker-compose\n pyenv\n z\n)\n
Touch ID in Terminal
Sourcetree
brew install --cask sourcetree\n
Pyenv
brew install pyenv\n
pyenv virtualenv
brew install pyenv-virtualenv\n
pre-commit
brew install pre-commit\n
Xcode Command Line Tools
xcode-select --install\n
TabbyML, an open-source, self-hosted AI coding assistant
We cannot simply use hosted coding assistants because of privacy and copyright issues. We can, however, use self-hosted coding assistants, provided they are trained on data with permissive licenses.
The StarCoder (1-7B) models are all trained on version 1.2 of The Stack dataset, which boils down to all open GitHub code with permissive licenses (193 licenses in total), minus opt-out requests.
The Code Llama and DeepSeek models are not clear enough about their data licenses.
brew install tabbyml/tabby/tabby\ntabby serve --device metal --model TabbyML/StarCoder-3B\n
Then configure your IDE by installing a plugin.
Sign commits using SSH