From a7626c27baa7f27f11b13abe0b6a9ac8d2955fa3 Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Wed, 7 Aug 2024 20:22:29 +0100 Subject: [PATCH 01/10] Minor updates to the paper up to end of 4.3.4. --- competencies.md | 119 +++++++++++++++++++++++++++--------------------- 1 file changed, 68 insertions(+), 51 deletions(-) diff --git a/competencies.md b/competencies.md index a33cab43..c47624bd 100644 --- a/competencies.md +++ b/competencies.md @@ -124,9 +124,10 @@ They are now vital elements of the research process across almost all domains. They enable researchers to collect and process ever-increasing amounts of data, simulate a wide range of physical phenomena across previously unexplored scales of the universe, and discover previously inconceivably complex structures in nature and societies via \ac{ML}. -This prevalence of computations in research means that +This prevalence of computation in research means that digital skills are now required by researchers at all -career levels, and in fields significantly beyond the previously expected. +career levels, and in fields significantly beyond those that would previously +have been expected. Research software is now used and developed not only in \ac{STEM} domains, but also in other fields, like medicine and the humanities. @@ -159,7 +160,7 @@ organisations over the past decade, or they may be embedded within a research team. They may have a job title that officially recognises them as an RSE, or they may have a standard research or technical job title such as Research Assistant, Research Fellow, or Software Engineer. Regardless of their job title, -RSEs share a set of core skills that are required to write software, understand +RSEs share a set of core skills that are required to design and develop research software, understand the research environment, and ensure that they produce sustainable, maintainable code that supports reproducible research outputs, following the \ac{FAIR} principles [@FAIR4RS]. 
@@ -174,7 +175,7 @@ research outputs and, ultimately, society as a whole. These competencies draw upon skills from traditional SE practice, established research culture, and the commitment to being part of a team. However, we see this set of skills as a foundation to build upon. -We envision that through specialised training the set of skills +We envision that through specialised training, the set of skills of graduate RSEs and domain researchers will grow. This is underlined by a growing interest to perform RSE research, i.e. research into methods and tools more catered to the unique @@ -183,13 +184,15 @@ challenges that research software provides. would have chosen different competencies, we will argue that this trifecta of skills is exactly what is needed for modern, software-enabled research. --> -While this community paper is based on workshop discussions that were attended largely by Research Software Engineers (deRSE23 in Paderborn, un-deRSE23 in Jena, Germany), +While this community paper is based on workshop discussions that were attended +largely by Research Software Engineers (deRSE23 in Paderborn, un-deRSE23 in +Jena and deRSE24 in Würzburg, Germany), we believe that the competencies formulated here can offer far-reaching -impact beyond the domain of RSE into adjacent fields of science and, indeed, the wider research community, -since most research involves some amount -of data management, processing and visualisation. -At the same time, funding bodies and computing infrastructure providers -prioritise projects that generate archived, +impact beyond the domain of RSE into adjacent aspects of research such as +research data. +For example, this is especially important given that much research involves some amount of +data management, processing and visualisation, and funding bodies and computing +infrastructure providers will sometimes prioritise projects that generate archived, annotated, re-usable, and potentially remotely executable data. 
In particular, funding agencies and research managers will find the discussion in this paper valuable in order to discover where RSEs see their place in the existing landscape of scientific domains @@ -302,7 +305,7 @@ biology-focused example of a competency portal. In addition to collecting information on a range of competencies that can be browsed within the web-based tool, it also provides career profiles for roles within the domains that \ac{EMBL-EBI} focuses on. The hub provides -access to variety of training resources that are linked to the specific +access to a variety of training resources that are linked to the specific competencies that they relate to. This enables learners to more easily find the right training materials in order to support their career development journey, helping them to identify what they might want to @@ -336,10 +339,9 @@ the \acrfull{HIFIS} [@HIFIS], and the already mentioned SURESOFT [@SURESOFTLin # Values {#sec:values} -The activities of an RSE are guided by ethical values. -In addition to the values for good scientific practice [@dfg_gsp], RSEs also adhere to -the \ac{SE} Code of Ethics [@Gotterbarn1999]. -Central to that code is the RSE's obligation to +In addition to the values for good scientific practice [@dfg_gsp], RSEs should also adhere to +the \ac{SE} Code of Ethics [@Gotterbarn1999] and their activities should be guided by ethical values. +Central to the \ac{SE} Code of Ethics is the RSE's obligation to commit to the health, safety and welfare of the public and act in the interest of society, their employer and their clients. 
Further values loosely based on that code include the obligations @@ -358,12 +360,13 @@ Humphreys [@humphreys_extending_2004] regards this development as "more importa The epistemological status of computer modelling and simulation is still the subject of debate, which ranges from the postulate of a new process of knowledge creation that has its own, unique, epistemology [@winsberg_sanctioning_1999] to the perception that from a philosophy of science perspective, there is nothing really new [@frigg_philosophy_2009]. -In any case it is clear, that the construction of a simulation-model, -including the selection of the salient characteristics of the system to be modelled, +In any case, it is clear that a number of decisions in the construction of a simulation-model +will have a decisive impact on the adequacy for purpose [@bokulich_data_2021] of the model. +These decisions include the selection of the salient characteristics of the system to be modelled, the choice of the mathematical representation of the processes to be represented, the choice of numerical methods and other algorithms -and even including the design of the user-interface -will have a decisive impact on the adequacy for purpose [@bokulich_data_2021] of the model. +and even including the design of the user-interface. + The relationship between initial state, inputs and final state of a computer simulation is "epistemically opaque" [@humphreys_extending_2004], in that not every step of the process is directly observable. The current trend of an increasing application of computationally irreducible systems, such as those based on artificial neural networks, @@ -371,15 +374,18 @@ further exacerbates this inherent limitation of explainability. An RSE usually takes a pivotal role in assessing this adequacy for purpose of a model as well as in characterising and communicating the domain of its legitimate application and its limits of interpretability. 
-This role, together with the enormous reliance on modelling and simulation of scientific results -as well as real-world decision-making -places a larges responsibility on the RSE. +This role, together with the enormous reliance on modelling and simulation of scientific results, +as well as real-world decision-making, +places a large responsibility on the RSE. It is important that RSEs are aware of this responsibility and continuously improve their capabilities to live up to it. -Research software is also well on its way to be ever-present in data-driven research, in all research fields. +Research software is also well on its way to being ever-present in data-driven research, in all research fields. This can probably be most prominently seen by considering software used to analyse data, e.g. within experimental research. It is not unusual for RSEs to support those more research data oriented efforts as well. -Here specifically they closely interact with research data management, by designing research software such as to be better able to adhere to the \ac{FAIR} principles for the research data, but also to follow similar rules for the research software (FAIR4RS [@FAIR4RS]). +Here, specifically, they closely interact with research data management +professionals and practices by designing research software that is better able to +adhere to the \ac{FAIR} principles for research data, but also to follow +similar rules for research software (FAIR4RS [@FAIR4RS]). As such, they are then familiar with special requirements stemming from the field itself, e.g., in medical research, and with privacy related issues especially for personal data, e.g., for conducting surveys. RSEs often assume a multifaceted role at the junction of research, \ac{SE} and data management. @@ -393,7 +399,7 @@ We list some example areas that highlight present-day challenges. ### Handling of data and personal data -A lot of RSE work involves the manipulation or creation of data processing devices. 
+A lot of RSE work involves the manipulation or creation of data processing tools. We highlight that professional conduct requires these creations to be reliable and to maintain data integrity. In particular, the way that personal data is handled can have far-reaching implications for society. Independent of the encoding into the respective national law in an RSE's jurisdiction, @@ -403,19 +409,19 @@ RSEs need to be aware of this topic's importance and deal with tensions that might arise with researchers' desire for trouble-free sharing of data, thereby expecting openness about the research process, versus the integrity expectations of the society towards \ac{IT} systems. Handling personal data also has ramifications for information security considerations during the software development process. -Data protection is a difficult subject, so RSEs should notice when they need to consult external expertise, for example when dealing with +Data protection is a complex topic, so RSEs should be aware of when they may need to consult external expertise, for example when dealing with special topics such as cryptography or re-identification attacks [@Henriksen2016]. ### Mentoring and diversity {#sec:mentoring-and-diversity} RSEs are often experienced professionals who instruct and work closely with early career researchers. -Similarly to academic supervisors, they bear a certain responsibility to guide and advise younger colleagues +Similarly to academic supervisors, they bear a certain responsibility to guide and advise less-experienced colleagues with respect to career development and the achievement of academic goals. This can take the form of supervising a student or mentoring a fellow RSE. The RSE needs to be aware of the biases arising from the sociological imbalances in research and academia. According to the \ac{UNESCO} Science Report [@Schneegans2021] women account for 33.3% of all researchers. 
-60.2% of researches come from high-income countries which account for 17.5% of the global population in 2018. +60.2% of researchers come from high-income countries which account for 17.5% of the global population in 2018. Furthermore, the socioeconomic background of academics is not representative of the general population, for example in the US a tenure-track academic is 25 times more likely to have a parent with a PhD [@Morgan2022]. Thereby, to promote their values of an honest, open, and inclusive research space, they should be aware of the diversity problems and help to mitigate them whenever they have the chance to do so. @@ -471,14 +477,15 @@ The role of an RSE lies somewhere on the spectrum between that of a researcher (the "R") and a software engineer (the "SE") and, therefore, requires competencies in both fields. RSEs typically have a background in research or software engineering, but they definitely have obtained broader knowledge in both fields. -They typically apply their knowledge and experience in larger teams, -which allows them to cultivate this hybrid nature. +Even when working as the only RSE on a task or project, they typically apply +their knowledge and experience as part of larger teams of researchers and +technical professionals, which allows them to cultivate this hybrid nature. There are many ways to categorise the competencies of an RSE. We chose to -distribute these competencies over the three pillars to reflect the fact -that RSEs are both competent researchers (the research skills, Section +distribute these competencies over three pillars to reflect the fact +that RSEs are both competent researchers (the research skills, @sec:research-skills) and software engineers (the software/technical skills, -Section @sec:software-skills). The third pillar (communication skills, -Section @sec:communication-skills) forms the bridge between the former two +@sec:software-skills). 
The third pillar (communication skills, +@sec:communication-skills) forms the bridge between the former two categories, with a particular focus on the software and research cycle and the scientific process. These competencies are relevant in a broad setting and form the foundation for specific specialisations. @@ -495,9 +502,9 @@ and competencies also depends on the size of the RSE team the different RSE specialisations (**RSE specialisations**). -During the Paderborn workshop we asked learners and novice RSEs what they would +During the Paderborn workshop (deRSE23) we asked learners and novice RSEs what they would like to have learnt. The top five items mentioned were: testing, contributing to -large projects, when or why to keep repositories private, proper development, and +large projects, when or why to keep repositories private, high-quality software development, and finding a community. Those topics comprise combinations of the skills and competencies defined below. We will elaborate these in @sec:tasks-and-responsibilities. @@ -507,7 +514,8 @@ competencies defined below. We will elaborate these in \newcommand{\skillsection}[1]{\hypertarget{skills-#1}{% \subsubsection{\glsentrydesc{#1} (\texorpdfstring{\glsentrytext{#1}}{#1})}\label{skills-#1}}} -Besides skilled researchers, RSEs are also competent software engineers. +RSEs are highly-skilled software engineers, in addition to having an +understanding of the research environment. As such, they ideally can solve complex software engineering problems and design software as a user-oriented, future-proof product. 
The technical skills required by an RSE overlap to a large extent with the common fundamental software engineering skills (see, e.g., @Landwehr2017), @@ -526,7 +534,7 @@ we refer to the Guide to the Software Engineering Body of Knowledge (@swebok_201 Because research software engineering is an interface discipline, RSEs are often stronger in topics more commonly encountered in research software contexts (e.g., mathematical and engineering foundations) -than in other (e.g., software engineering economics). +than in other areas (e.g., software engineering economics). However, they bring a solid level of competence in all software engineering topics. Therefore, RSEs can set and analyse software requirements in the context of open-ended, question-driven research. They can design software so that it can sustainably grow, often @@ -536,7 +544,9 @@ They can formulate and implement various types of tests, they can independently and automate operations of the integration and release process. They can provide working, scalable, and future-proof solutions in a professional context and with common project and software management techniques, adapted to the needs of the research environment. -Finally, as skilled researchers they combine the necessary foundations from their domain with software engineering skills to develop complex software. +Finally, as people who have often gained significant experience of research +in a particular discipline, they combine the necessary foundations from their domain with software +engineering skills to develop complex software. \skillsection{SWLC} @@ -545,7 +555,7 @@ The traditional software development life cycle defines the stages that form the Initial development generally involves an analytic process where requirements and ideas are gathered and analysed (requirements engineering), followed by formulating a plan to fulfil them (design) that is finally turned into running code (implementation). 
This is accompanied by different measures of quality control (e.g., reviews, testing), validating and verifying -that things work as expected and that they continue to do when development progresses further. Depending on the software project, this can mean a simple "Think-before-you-do", or more elaborate and formal processes. +that things work as expected and that they continue to do so when development progresses further. Depending on the software project, this can mean a simple "Think-before-you-do", or more elaborate and formal processes. Often the development cycles are executed iteratively and incrementally. The life cycle further includes periods of deployment, maintenance and further development (software evolution), as well as software retirement. @@ -560,9 +570,11 @@ and be able to predict and cater to the changing needs of a software project as The RSE should be able to create building blocks from source code that are reusable. This ranges from simple libraries of functions up to complex architectures consisting of multiple software packages. An important part of -reusability is that at least oneself, and ideally others, are able to understand -what a piece of software aims to do and how to enable others to use the provided functionality. This -is primarily achieved through a "clean" implementation and enhanced by +enabling code reusability is the provision of sufficient information in the +form of comments within code, documentation or other means. This is vital to +ensure that developers and maintainers understand what a piece of software aims +to do and how to enable others to use the provided functionality. +This is primarily achieved through a "clean" implementation and enhanced by documentation. Documentation ranges from commenting code blocks to using documentation (building) tools. It should be written with consideration for the different audiences who may need it @@ -604,7 +616,8 @@ especially from a structural point of view. 
An RSE should understand their output and its implications. An important facet of this capability relates to information security. RSEs need to consider the safety and integrity of personal data and other sensitive information -and make sure that they do not negatively impact the integrity of their institution's network. +and make sure that they do not negatively impact the integrity of their +institution's network and computing infrastructure. ## Research skills {#sec:research-skills} @@ -626,7 +639,8 @@ becomes a motivation to work. \skillsection{RC} -One of the crucial skills of RSEs is their mental proximity to research. +One of the key skills that RSEs have is their understanding of how research +works. They embrace being part of a larger community which, despite friendly competition, shares the common goal of gaining knowledge to disseminate it. @@ -652,7 +666,7 @@ software via software citations and metadata. To evaluate whether the artefacts to be re-used suit their needs, RSEs often need to consider the scientific context of their origin. For example, a paper that references the code under consideration -might be crucial to validate its fitness for purpose or to discredit it. +might be crucial to validate its fitness for purpose or lack of suitability. Code that incorporates research-domain specific knowledge needs to be understood at a very detailed level and its re-use documented to meet standards of good research practice. @@ -667,8 +681,10 @@ research domain at hand. Another part of \ac{FAIR} software is concerned with publishing new and derived works and making them available for re-use by the research community and the general public. -RSEs need to have a basic understanding of common software licence types, such as "proprietary", "copyleft", and "permissive", -their compatibility, and the ramifications for re-using and composing programs. 
+RSEs need to have a basic understanding of common software licence types, including +proprietary and open source licences and how "copyleft" and "permissive" open +source licences differ. They should also understand compatibility between +different licences, and the ramifications for re-using and composing programs. Beyond that, RSEs will need to properly execute the technicalities of software publishing. These include the application of licences and copyright statements, understanding and assigning software authorship, crediting contributors, @@ -707,7 +723,7 @@ warrants a much stronger emphasis on this field of competence. Being able to work, and effectively communicate in teams is essential for RSEs. For example, RSEs need to be able to explain particular implementation choices made and may even need to defend them. -Within a team of RSEs code reviews improve knowledge transfer and increase team cohesion. +Within a team of RSEs, code reviews improve knowledge transfer and increase team cohesion. The team might change on a project-to-project basis and might be comprised of colleagues with very different backgrounds including, for example, \ac{IT} staff, domain scientists and technicians working alongside software engineers. The shared values come into play and each RSE needs to ensure that these values are lived by and passed on to others. @@ -725,8 +741,9 @@ Code review also includes aspects of the teaching skill. \skillsection{PM} -The RSE should have knowledge about project management. At some institutes, it follows the practices of the local research groups, -but it is useful if an RSE knows its place in a \gls{PM} scheme, or can bring in new ideas for improvement. +The RSE should have knowledge of project management processes. At some +institutes, project management tools and approaches differ between individual research groups, +but it is useful if an RSE understands general structures of a \gls{PM} scheme, or can bring in new ideas for improvement. 
Project management in research software engineering poses specific challenges (see \gls{USERS}) that might require the capacity to flexibly adapt to changing conditions and deviate from common project management methods. Additionally, the RSE should know that SE offers various methods and approaches From 8b5bb39b9e9c5ebf75e6790b3e30131b1f37e54f Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Thu, 8 Aug 2024 21:14:36 +0100 Subject: [PATCH 02/10] Minor edits for sections 4.4-6. --- competencies.md | 71 ++++++++++++++++++++++++++----------------------- glossary.tex | 1 + 2 files changed, 39 insertions(+), 33 deletions(-) diff --git a/competencies.md b/competencies.md index c47624bd..c8065a87 100644 --- a/competencies.md +++ b/competencies.md @@ -769,13 +769,14 @@ These skills, while already numerous are also generic on purpose. They span a multidimensional space in which the day-to-day tasks and responsibilities of an RSE can be found. We describe here some examples of the competencies applied in combination to -the set of current common tasks and challenges for RSEs identified during the Paderborn workshop. +the set of current common tasks and challenges for RSEs identified during the +deRSE23 Paderborn workshop. The most obvious task of an RSE is to develop software that is used in research. This broad topic requires all the \ac{SE} skills. Of course, these are the competencies that are the most fluid since they have to adapt to frequent technological advancements. Additionally, proper \ac{SE} skills often require -knowledge of \gls{TEAM}, and \gls{PM}. Today this means effective use of +knowledge of \gls{TEAM}, and \gls{PM}. Today, this means effective use of \acp{IDE}, \gls{static-analysis} tools, \glspl{design-pattern} and documentation (for oneself and others). @@ -802,7 +803,7 @@ Software development also includes testing. This task is a manifestation of the is required in order to write good tests that facilitate understanding and documentation. 
Today this encompasses the knowledge of testing frameworks as well as \ac{CICD} practices. In addition to being -tested, software should also be reproducible. Projects like ReproHack +tested, software should also provide reproducible outputs. Projects like ReproHack [@ReproHack] can greatly help in fostering that competency. Apart from testing, there are many code analysis tools to monitor and improve the quality of code. @@ -840,9 +841,9 @@ Most RSEs will contribute to other projects, some of which will be large. This is a topic that requires competency in \gls{SWREPOS}, \gls{SRU}, and \gls{SP} in order to understand the ramifications of sharing, and in \gls{DOCBB}, since the contributed code has to be understood by others. Interacting with project -members depends on the \gls{TEAM} skill. Today this entails the effective use +members depends on the \gls{TEAM} skill. Today, this frequently involves the effective use of collaborative platforms like \gls{GitHub}/\gls{GitLab}, honouring a -project's code of conduct, and some knowledge of popular software licences, +project's code of conduct, and some knowledge of popular open source software licences, e.g. the \ac{GPL}. RSEs are embedded in communities. There are two different aspects to finding @@ -851,24 +852,26 @@ research project. Since this deals with software that is supposed to be used in research this requires knowledge of \gls{RC}, \gls{USERS}, and also \gls{NEW}, in order to effectively interact with domain scientists. Today, an example is a presence on social media. The other \gls{TEAM}-related aspect is the embedding -of RSE graduates into the community of RSEs, sharing the same set of values and -competencies. We envision our RSE graduates to be a part of a strong network of +of recently-trained RSEs into the Research Software Engineering community, +sharing the same set of values and competencies. 
+We envision newcomers to the RSE field becoming part of a strong network of RSEs, tool-related communities, and the classical domain communities, making them more effective at supporting research. These networks are a lifelong manifestation where RSEs work to provide an inclusive environment for their -peers and provide opportunities for life long learning. +peers and provide opportunities for lifelong learning. RSEs are also mentoring colleagues (see also @sec:mentoring-and-diversity). This necessitates giving good advice that fits to a project's stage in its life cycle, thereby requiring knowledge of (\gls{SWLC}), and its context in its research domain and thus (\gls{RC}). -Research software often starts out as a tool to answer a personal research -question and becomes more important when other researchers rely on it. Some -research software might even be used to deal with critical questions such as +Research software can often start out as a tool to answer a personal research +question, becoming more important when other researchers start to rely on it. +At the other end of the scale, research software can sometimes underpin key +processes that deal with critical questions such as weather forecasting or medical diagnosis. A classification of software is commonly used to formalise the process of giving good advice [@Wang2012; @Schlauch2018b] where research software can move from one class to another -during its life cycle. [@Schlauch2018b] classify applications based on their scope -and criticality and provide \ac{SE} recommendations. The RSE needs to be able to +during its life cycle. [@Schlauch2018b] classifies applications based on their scope +and criticality and provides \ac{SE} recommendations. The RSE needs to be able to identify the application class they are dealing with and apply the respective RSE practices. @@ -877,9 +880,9 @@ and services with different variants for different research purposes and groups. 
Additionally, many research groups develop their own codes for specific research purposes, e.g. simulation codes or specialised data analysis pipelines. A lot of their development of new features is project-based, -often through PhD projects, -which grow into their own variant with re-integration planned as a final step. -To reduce the chance of variant source code diverging and producing +often through PhD projects. Work can sometimes result in code that diverges from +the main project into a separate variant with re-integration planned as a final step. +To reduce the chance of variant source code diverging significantly and producing a large integration overhead, \gls{PM} skills and methods are needed. More specifically, software product line management methods have been developed for this exact problem and purpose. @@ -897,7 +900,7 @@ To elaborate on that, we have prepared the following table with three levels of - Junior RSE: These are people who are in the earlier stages of their RSE career journey, but they should ideally have research experience of their own as well as the skills to contribute reliable and well-structured code to software projects. - Senior RSE: They have gained experience, both concerning their software skills as well as in their research collaborations in potentially many different fields. They can set the standards in a software project. -- Principal RSE: Their actual job description varies a lot. These may be RSE team leaders based in a professional services type role, or they may be professors or research group leaders based in a more academic-focused role. They are often the people responsible for bringing in the funding that supports new and sustains existing projects. 
Generally speaking, they do not need to be actively involved in the day-to-day technical tasks, but they should be able to guide projects from both a technical and a research perspective while providing an inclusive working space and in many cases, lead research or RSE teams. +- Principal RSE: Their actual job description varies a lot. These may be RSE team leaders based in a professional services type role, or they may be professors or research group leaders based in a more academic-focused role. They are often the people responsible for bringing in the funding that supports new and sustains existing projects. Generally speaking, they do not need to be actively involved in the day-to-day technical tasks, but they should be able to guide projects from both a technical and a research perspective while providing an inclusive working space. @tbl:comp-lvls-techn, @tbl:comp-lvls-res, and @tbl:comp-lvls-comm elaborate on the required facets of the competencies in different roles. A story-like example of an individual through the hierarchies can be found in @subsec:examplecareer . @@ -909,9 +912,9 @@ A story-like example of an individual through the hierarchies can be found in @s | Competency | Junior RSE | Senior RSE | Principal RSE | | --- | ---------- | ---------- | ---------- | | \gls{SWLC} | Should be aware of the software life cycle. | Should know where in the life cycle their project is and which decisions are likely to lead to technical debt. | Should know how to manage and steer development/project resources accordingly. Should also have an understanding of the potential consequences of key project management decisions. | -| \gls{DOCBB} | Should be able to write reusable building blocks. | Same as junior, but the quality should set the standard for the project, while following current best practices. | Should know the current best practices and point their staff to the right resources. | +| \gls{DOCBB} | Should be able to write reusable building blocks. 
| Same as junior, but the quality should set the standard for the project, while following current best practices. | Should know the current best practices and point their team members and collaborators to the right resources. | | \gls{DIST} | Should be able to use package distribution platforms. | Same as junior, but should also be familiar with current best practices for building and deploying packages. | Should ensure that their project is available via an up-to-date and secure distribution platform. | -| \gls{SWREPOS} | Should seamlessly interact with the repository of their project. | Should be well-versed in the intricacies of a repository, and probably interact with repositories of multiple projects. | Should promote the use of repositories and be able to convey best practices of sharing and reviewing to junior and senior RSEs. | +| \gls{SWREPOS} | Should seamlessly interact with the repository of their project. | Should be well-versed in the intricacies and best practices around working with a repository, and probably interact with repositories of multiple projects. | Should promote the use of repositories and be able to convey best practices of sharing and reviewing to junior and senior RSEs. | | \gls{MOD} | Should have a basic grasp of their piece of the software in order to use basic tools such as a debugger. | Should understand the characteristics of large parts of the codebase considering a variety of the metrics. | Should understand the big idea of the software project in order to define the task that the software solves. | Table: Levels of technical skills expected per RSE career stage. {#tbl:comp-lvls-techn} @@ -923,14 +926,14 @@ Table: Levels of technical skills expected per RSE career stage. {#tbl:comp-lvls | \gls{NEW} | Should have some curiosity to fit into research teams. | Same as junior, but they should proactively propose directions in individual aspects of the project. 
| Should have research insights and a broad view of the research field to steer the project. | | \gls{RC} | Should be aware of the research life cycle. | Should know the position of the project in the research life cycle. | Should know what is necessary for the project to fit into its position in the research life cycle. | | \gls{SRU} | Should be aware of software reusability tools. | Should be able to search with software reusability tools. | Should be able to effectively search with \gls{SRU} tools and to evaluate and perform the integration of a library into the project. | -| \gls{SP} | Should be aware that software publication needs to consider issues of intellectual property. | Should be able to correctly publish software in simple cases and to identify cases where professional legal advice is needed. | Same as senior, plus the ability to take the future publication of software into account when initiating and guiding larger software collaboration projects. | +| \gls{SP} | Should be aware of available opportunities to publish software and understand the need to consider issues of intellectual property. | Should be able to correctly publish software in simple cases and to identify cases where professional legal advice is needed. | Same as senior, plus the ability to take the future publication of software into account when initiating and guiding larger software collaboration projects. | | \gls{DOMREP} | Should be able to interact with the domain repository. | Same as junior RSE. | Same as junior, and should know about how it fits into workflows surrounding these domain repositories. | Table: Levels of research skills expected per RSE career stage. {#tbl:comp-lvls-res} | Competency | Junior RSE | Senior RSE | Principal RSE | | --- | ---------- | ---------- | ---------- | -| \gls{TEAM} | Should be able to work in the team in order to effectively fulfil the given tasks. Should be able to learn from code review. 
| Should be able to break down tasks into more easily digestible sub-tasks. | Should be able to lead the team and set the respective direction. | +| \gls{TEAM} | Should be able to work in the team in order to effectively fulfil the given tasks. Should be able to learn from code review. | Should be able to break down tasks into more easily digestible sub-tasks and review or guide work undertaken by less-experience team members. | Should be able to lead the team and set the respective direction. | | \gls{TEACH} | Should be able to perform simple peer-to-peer on-boarding tasks. | Should be able to explain logical components and the general architecture to other RSEs. | Should be able to effectively communicate about all high-level parts of the project. | | \gls{PM} | Should be aware of the employed \gls{PM} method. | Should be able to use and adapt the employed \gls{PM} method. | Should be able to design and adapt the employed \gls{PM} method. | | \gls{USERS} | Should be able to communicate with both users and SEs on the project, on topics of the research and SE. | Same as junior RSE, and be able to interpret the feedback. | Same as senior, and should also be able to effectively take feedback into account when steering the project. | @@ -1014,7 +1017,7 @@ They should encourage their students to use relevant tools (\gls{DIST}). They need to be able to judge the suitability of the software (\gls{SWLC}) and follow the interactions between relevant projects (\gls{SWREPOS}). They should be able to advise their students on the legal aspects of software production and distribution (\gls{SP}). They should be able to contribute meaningfully to the steering decisions of the software in their field (\gls{USERS}). -They are able to guide students and give full-size lectures (\gls{TEACH}). +They are able to guide students and preare and deliver full-length lectures (\gls{TEACH}). They need to manage and lead their research group (\gls{PM}, \gls{TEAM}). 
## Project team structures @@ -1032,7 +1035,7 @@ We first summarise the meaning of each of the columns in the tables: - **Competency:** The code assigned to the competency being considered, as defined in @sec:required-generic-skills, e.g. \gls{TEAM}. - **Individual RSE (Locally-based):** A single person working on some - research software - often an RSE with focus on their own research. Often time-constrained, + research software - often primarily a researcher, but learning and applying RSE skills with focus on their own research. Often time-constrained, may be self-taught. - **Individual RSE (RSE team-based):** A single person working on research software - generally a professional RSE assigned to support another team's software on their own, @@ -1080,13 +1083,13 @@ The current situation may differ. \gls{DIST}& Does not emphasise code reusability and sharing/distribution.& Puts greater focus on reusability/sharing, but likely not as part of the project aims.& - May want to develop reusable shareable outputs for a specific case. Needs clear guidelines.& + May want to develop reusable, shareable outputs for a specific case. Needs clear guidelines.& Focuses on quality and best practices. Reusability/packaging driven by project needs and spec.& Should provide policies on reusability/sharing. May be driven by requirements/policies, e.g., of institution or funding agency. \\\hline %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \gls{SWLC}& - Manages the complete life cycle, bus factor equal to 1.& + Manages the complete life cycle, \glspl{bus-factor} equal to 1.& The team supports parts of the software life cycle, but with low bus factor.& The team infrastructure and tooling supports the life cycle and sustainability.& The bus factor may still be low in parts of the code. @@ -1255,7 +1258,7 @@ to and handled by researchers and RSEs working in different environments within an organisation and how the organisations themselves can contribute. 
We recognise that this is a challenging area to gain a detailed view of and that this is still a significant generalisation. -We talk about the "research software engineer" as a single entity but as the field expands, +We talk about the "Research Software Engineer" as a single entity but as the field expands, we expect to see more roles and job titles emerging around the RSE concept, many of which fit under the wider umbrella of research technology professionals (RTPs) [@ukri_rtp] [@techniciancommitment]. Examples are different RSE-like computational roles of @@ -1378,7 +1381,9 @@ The development of research software based on \ac{ML} requires specialised theor This involves knowledge about data analysis and feature engineering, metrics that are involved in \ac{ML}, \ac{ML} algorithm selection and cross validation, and knowledge in mathematical optimisation methods and statistics. Here, we use \ac{ML} in a broad sense of machine-based learning including deep learning, reinforcement learning, neuro-symbolic learning and similar. -ML-RSEs analyse and check the suitability of an algorithm if it fulfils the needs of a certain task and they play a main role in deciding and selecting \ac{ML} libraries for a given task. +ML-RSEs analyse and check the suitability of an algorithm. They check if it +fulfils the needs of a certain task and they play a central role in deciding on +and selecting \ac{ML} libraries for a given task. The increasing usage of \ac{ML} in numerous scientific areas with social impact involves an emphasised awareness and consideration of possible influences and biases. At the intersection of data science [@SSIDataScience] and data-focused RSEs, the complex way of solving problems utilising \ac{ML} calls for this separate specialisation. @@ -1393,16 +1398,16 @@ Adaption of existing, large-scale codebases to evolving dependencies (\gls{DIST} may require mastery in refactoring techniques and in the usage of specialised code transformation tools. 
#### Web-development RSE -This RSE is skilled in web applications, front- and/or backend, and/or building -and using APIs, for example for research data portals or big research projects. -A deep knowledge of web development skills is a required additional skill for this RSE. +This RSE is skilled in the development of web applications and/or mobile apps. +They have expertise in one or more of frontend development, backend development +and the design or implementation of APIs, for example to support research data portals or big research projects. Since a lot of web services for research may be accessible to a large audience or even to the public, this RSE is also familiar with aspects relating to cybersecurity, usability and accessibility. Not only do they need to balance these concerns while adhering to their values from @sec:values, but they also need to efficiently communicate the decisions made to stakeholders. #### Legal-RSE -All RSEs are a go-to person for questions about licensing, in particular when mixing software components that use different licences. +RSEs are often the go-to person for questions about software licensing, in particular when mixing software components that use different licences. But with the rising requirements from legislation, we foresee the need for RSEs that still have a background in RSE but extend it with a knowledge of legal processes that cover corner cases and go beyond applying Best Practice guides. These requirements may arise in the area of publication of research software, @@ -1410,7 +1415,7 @@ as this also requires knowledge about particular laws or regulatory frameworks c like the \ac{GDPR} within the \ac{EU} [@GDPR]. Another area are legal aspects of cybersecurity and export control in science and research (see [@ExportControl] for Germany). Legal-RSEs focus on facilitating the achievement of technically feasible solutions, while adhering to regulatory mandates. -They are able to communicate and collaborate with lawyers. 
+They are able to communicate and collaborate effectively with lawyers. While this community paper is based on workshop discussions that were attended largely by Research Software Engineers (deRSE23 in Paderborn, un-deRSE23 in -Jena and deRSE24 in Würzburg, Germany), +Jena, and deRSE24 in Würzburg, Germany), we believe that the competencies formulated here can offer far-reaching impact beyond the domain of RSE into adjacent aspects of research such as research data. From 5747d185831ee1ecf5220d48b08e449ebebd5f56 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Frank=20L=C3=B6ffler?= Date: Fri, 9 Aug 2024 13:50:14 +0200 Subject: [PATCH 05/10] Update competencies.md Co-authored-by: Gerasimos Chourdakis --- competencies.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/competencies.md b/competencies.md index 059a4d1c..1de7a39a 100644 --- a/competencies.md +++ b/competencies.md @@ -188,7 +188,7 @@ While this community paper is based on workshop discussions that were attended largely by Research Software Engineers (deRSE23 in Paderborn, un-deRSE23 in Jena, and deRSE24 in Würzburg, Germany), we believe that the competencies formulated here can offer far-reaching -impact beyond the domain of RSE into adjacent aspects of research such as +impact beyond the domain of RSE into adjacent aspects of research, such as research data. 
For example, this is especially important given that much research involves some amount of data management, processing and visualisation, and funding bodies and computing From 799161acb8904fe8cfad485507a356b011dd9067 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Frank=20L=C3=B6ffler?= Date: Fri, 9 Aug 2024 13:52:29 +0200 Subject: [PATCH 06/10] Do not call this an example Co-authored-by: Gerasimos Chourdakis --- competencies.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/competencies.md b/competencies.md index 1de7a39a..c25065eb 100644 --- a/competencies.md +++ b/competencies.md @@ -190,7 +190,7 @@ Jena, and deRSE24 in Würzburg, Germany), we believe that the competencies formulated here can offer far-reaching impact beyond the domain of RSE into adjacent aspects of research, such as research data. -For example, this is especially important given that much research involves some amount of +This is especially important given that much research involves some amount of data management, processing and visualisation, and funding bodies and computing infrastructure providers will sometimes prioritise projects that generate archived, annotated, re-usable, and potentially remotely executable data. From faaa7b99333b81332fee34f595110b1015dcbdbf Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Fri, 9 Aug 2024 19:10:30 +0100 Subject: [PATCH 07/10] Apply suggestions from review of review! MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Frank Löffler Co-authored-by: Jan Philipp Thiele <54978337+jpthiele@users.noreply.github.com> --- competencies.md | 34 ++++++++++++++++++---------------- glossary.tex | 2 +- 2 files changed, 19 insertions(+), 17 deletions(-) diff --git a/competencies.md b/competencies.md index 32e71b47..fa5aad12 100644 --- a/competencies.md +++ b/competencies.md @@ -124,7 +124,7 @@ They are now vital elements of the research process across almost all domains. 
They enable researchers to collect and process ever-increasing amounts of data, simulate a wide range of physical phenomena across previously unexplored scales of the universe, and discover previously inconceivably complex structures in nature and societies via \ac{ML}. -This prevalence of computation in research means that +This prevalence of computation and digitally-aided data analysis in research means that digital skills are now required by researchers at all career levels, and in fields significantly beyond those that would previously have been expected. @@ -185,8 +185,8 @@ would have chosen different competencies, we will argue that this trifecta of sk is exactly what is needed for modern, software-enabled research. --> While this community paper is based on workshop discussions that were attended -largely by Research Software Engineers (deRSE23 in Paderborn, un-deRSE23 in -Jena, and deRSE24 in Würzburg, Germany), +largely by RSEs (deRSE23 in Paderborn, un-deRSE23 in +Jena, and deRSE24 in Würzburg, all in Germany), we believe that the competencies formulated here can offer far-reaching impact beyond the domain of RSE into adjacent aspects of research, such as research data. @@ -339,9 +339,11 @@ the \acrfull{HIFIS} [@HIFIS], and the already mentioned SURESOFT [@SURESOFTLin # Values {#sec:values} -In addition to the values for good scientific practice [@dfg_gsp], RSEs should also adhere to -the \ac{SE} Code of Ethics [@Gotterbarn1999] and their activities should be guided by ethical values. -Central to the \ac{SE} Code of Ethics is the RSE's obligation to +It is important that the activities of an RSE are guided by ethical values. +In addition to the values for good scientific practice [@dfg_gsp], RSEs also adhere to +the \ac{SE} Code of Ethics [@Gotterbarn1999]. 
+Central to that code is the RSE's obligation to +In addition to the values for good scientific practice commit to the health, safety and welfare of the public and act in the interest of society, their employer and their clients. Further values loosely based on that code include the obligations @@ -361,7 +363,7 @@ The epistemological status of computer modelling and simulation is still the sub which ranges from the postulate of a new process of knowledge creation that has its own, unique, epistemology [@winsberg_sanctioning_1999] to the perception that from a philosophy of science perspective, there is nothing really new [@frigg_philosophy_2009]. In any case, it is clear that a number of decisions in the construction of a simulation-model -will have a decisive impact on the adequacy for purpose [@bokulich_data_2021] of the model. +will have a significant impact on the adequacy for purpose [@bokulich_data_2021] of the model. These decisions include the selection of the salient characteristics of the system to be modelled, the choice of the mathematical representation of the processes to be represented, the choice of numerical methods and other algorithms @@ -382,10 +384,10 @@ It is important that RSEs are aware of this responsibility and continuously impr Research software is also well on its way to being ever-present in data-driven research, in all research fields. This can probably be most prominently seen by considering software used to analyse data, e.g. within experimental research. It is not unusual for RSEs to support those more research data oriented efforts as well. -Here, specifically, they closely interact with research data management -professionals and practices by designing research software that is better able to -adhere to the \ac{FAIR} principles for research data, but also to follow -similar rules for research software (FAIR4RS [@FAIR4RS]). 
+Here, specifically, they closely interact with research data management professionals +and practices by designing research software +that is better able to adhere to the \ac{FAIR} principles for research data, +but also to follow similar rules for research software (FAIR4RS [@FAIR4RS]). As such, they are then familiar with special requirements stemming from the field itself, e.g., in medical research, and with privacy related issues especially for personal data, e.g., for conducting surveys. RSEs often assume a multifaceted role at the junction of research, \ac{SE} and data management. @@ -409,7 +411,7 @@ RSEs need to be aware of this topic's importance and deal with tensions that might arise with researchers' desire for trouble-free sharing of data, thereby expecting openness about the research process, versus the integrity expectations of the society towards \ac{IT} systems. Handling personal data also has ramifications for information security considerations during the software development process. -Data protection is a complex topic, so RSEs should be aware of when they may need to consult external expertise, for example when dealing with +Data protection is a complex topic, so RSEs should be aware that they may need to consult external expertise, for example when dealing with special topics such as cryptography or re-identification attacks [@Henriksen2016]. @@ -544,7 +546,7 @@ They can formulate and implement various types of tests, they can independently and automate operations of the integration and release process. They can provide working, scalable, and future-proof solutions in a professional context and with common project and software management techniques, adapted to the needs of the research environment. 
-Finally, as people who have often gained significant experience of research +Finally, as people who have often gained significant research experience in a particular discipline, they combine the necessary foundations from their domain with software engineering skills to develop complex software. @@ -936,7 +938,7 @@ Table: Levels of research skills expected per RSE career stage. {#tbl:comp-lvls- | Competency | Junior RSE | Senior RSE | Principal RSE | | --- | ---------- | ---------- | ---------- | -| \gls{TEAM} | Should be able to work in the team in order to effectively fulfil the given tasks. Should be able to learn from code review. | Should be able to break down tasks into more easily digestible sub-tasks and review or guide work undertaken by less-experience team members. | Should be able to lead the team and set the respective direction. | +| \gls{TEAM} | Should be able to work in the team in order to effectively fulfil the given tasks. Should be able to learn from code review. | Should be able to break down tasks into more easily digestible sub-tasks and review or guide work undertaken by less-experienced team members. | Should be able to lead the team and set the respective direction. | | \gls{TEACH} | Should be able to perform simple peer-to-peer on-boarding tasks. | Should be able to explain logical components and the general architecture to other RSEs. | Should be able to effectively communicate about all high-level parts of the project. | | \gls{PM} | Should be aware of the employed \gls{PM} method. | Should be able to use and adapt the employed \gls{PM} method. | Should be able to design and adapt the employed \gls{PM} method. | | \gls{USERS} | Should be able to communicate with both users and SEs on the project, on topics of the research and SE. | Same as junior RSE, and be able to interpret the feedback. | Same as senior, and should also be able to effectively take feedback into account when steering the project. 
| @@ -1020,7 +1022,7 @@ They should encourage their students to use relevant tools (\gls{DIST}). They need to be able to judge the suitability of the software (\gls{SWLC}) and follow the interactions between relevant projects (\gls{SWREPOS}). They should be able to advise their students on the legal aspects of software production and distribution (\gls{SP}). They should be able to contribute meaningfully to the steering decisions of the software in their field (\gls{USERS}). -They are able to guide students and prepare and deliver full-length lectures (\gls{TEACH}). +They are able to guide students and prepare and deliver a full lecture course (\gls{TEACH}). They need to manage and lead their research group (\gls{PM}, \gls{TEAM}). ## Project team structures @@ -1038,7 +1040,7 @@ We first summarise the meaning of each of the columns in the tables: - **Competency:** The code assigned to the competency being considered, as defined in @sec:required-generic-skills, e.g. \gls{TEAM}. - **Individual RSE (Locally-based):** A single person working on some - research software - often primarily a researcher, but learning and applying RSE skills with focus on their own research. Often time-constrained, +research software - often an RSE with focus on their own research. Often time-constrained, may be self-taught. 
- **Individual RSE (RSE team-based):** A single person working on research software - generally a professional RSE assigned to support another team's software on their own, diff --git a/glossary.tex b/glossary.tex index 44364f44..33ee2f0e 100644 --- a/glossary.tex +++ b/glossary.tex @@ -16,7 +16,7 @@ % software engineering \newglossaryentry{design-pattern}{name={design pattern},description={general and reusable solution to solve a \ac{SE} problem (often a best practice, or a ``recipe'')}} \newglossaryentry{static-analysis}{name={static analysis},description={automated procedure to detect software bugs in source code without executing the code}} -\newglossaryentry{bus-factor}{name={bus factor},description={vulnerability of a project to losing team members - a bus factor of 1 means that only a single team member knows the details of the project and represents a high risk}} +\newglossaryentry{bus-factor}{name={bus factor},description={vulnerability of a project to losing team members - a bus factor of 1 means that only a single team member knows the details of the project, which represents a high risk}} % other \newglossaryentry{SysOp}{name={SysOp},description={system administrator in charge of a computing infrastructure}} From d24f63dd567664e5bf9526cf2d941d9fd48c50da Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Fri, 9 Aug 2024 19:12:41 +0100 Subject: [PATCH 08/10] Update from review --- competencies.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/competencies.md b/competencies.md index fa5aad12..75489d6c 100644 --- a/competencies.md +++ b/competencies.md @@ -516,8 +516,7 @@ competencies defined below. We will elaborate these in \newcommand{\skillsection}[1]{\hypertarget{skills-#1}{% \subsubsection{\glsentrydesc{#1} (\texorpdfstring{\glsentrytext{#1}}{#1})}\label{skills-#1}}} -RSEs are highly-skilled software engineers, in addition to having an -understanding of the research environment. 
+Besides skilled researchers, RSEs are also competent software engineers. As such, they ideally can solve complex software engineering problems and design software as a user-oriented, future-proof product. The technical skills required by an RSE overlap to a large extent with the common fundamental software engineering skills (see, e.g., @Landwehr2017), From 1bac4f7a838c4313d3928b79c9a123f3995d4e32 Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Fri, 9 Aug 2024 19:14:28 +0100 Subject: [PATCH 09/10] Update bus factor in glossary --- glossary.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.tex b/glossary.tex index 33ee2f0e..94a4ac5e 100644 --- a/glossary.tex +++ b/glossary.tex @@ -16,7 +16,7 @@ % software engineering \newglossaryentry{design-pattern}{name={design pattern},description={general and reusable solution to solve a \ac{SE} problem (often a best practice, or a ``recipe'')}} \newglossaryentry{static-analysis}{name={static analysis},description={automated procedure to detect software bugs in source code without executing the code}} -\newglossaryentry{bus-factor}{name={bus factor},description={vulnerability of a project to losing team members - a bus factor of 1 means that only a single team member knows the details of the project, which represents a high risk}} +\newglossaryentry{bus-factor}{name={bus factor},description={vulnerability of a project to losing key and irreplaceable team members - a bus factor of 1 means that a single such team member vanishing would already stall the project's advancement.}}``` % other \newglossaryentry{SysOp}{name={SysOp},description={system administrator in charge of a computing infrastructure}} From 2f2e312817be81c558c6e9608066e45b7d10d96c Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Fri, 9 Aug 2024 19:20:48 +0100 Subject: [PATCH 10/10] Fix glossary entry. 
--- glossary.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/glossary.tex b/glossary.tex index 94a4ac5e..52cb58bd 100644 --- a/glossary.tex +++ b/glossary.tex @@ -16,7 +16,7 @@ % software engineering \newglossaryentry{design-pattern}{name={design pattern},description={general and reusable solution to solve a \ac{SE} problem (often a best practice, or a ``recipe'')}} \newglossaryentry{static-analysis}{name={static analysis},description={automated procedure to detect software bugs in source code without executing the code}} -\newglossaryentry{bus-factor}{name={bus factor},description={vulnerability of a project to losing key and irreplaceable team members - a bus factor of 1 means that a single such team member vanishing would already stall the project's advancement.}}``` +\newglossaryentry{bus-factor}{name={bus factor},description={vulnerability of a project to losing key and irreplaceable team members - a bus factor of 1 means that a single such team member vanishing would already stall the project's advancement.}} % other \newglossaryentry{SysOp}{name={SysOp},description={system administrator in charge of a computing infrastructure}}