diff --git a/jsoc/allprojects.md b/jsoc/allprojects.md index ceab1af84b..6b040342af 100644 --- a/jsoc/allprojects.md +++ b/jsoc/allprojects.md @@ -1,7 +1,5 @@ -## View all GSoC/JSoC Projects +# View all GSoC/JSoC Projects This page is designed to improve discoverability of projects. You can, for example, search this page for specific keywords and find all of the relevant projects. -## Projects - {{ all_gsoc_projects }} diff --git a/jsoc/gsoc/MLJ.md b/jsoc/gsoc/MLJ.md index 86664bf075..2319a62c1d 100644 --- a/jsoc/gsoc/MLJ.md +++ b/jsoc/gsoc/MLJ.md @@ -1,13 +1,12 @@ -# MLJ.jl Projects – Summer of Code +@def mintoclevel = 2 + +# MLJ.jl Projects - Summer of Code [MLJ](https://github.com/alan-turing-institute/MLJ.jl) is a machine learning framework for Julia aiming to provide a convenient way to use and combine a multitude of tools and models available in the Julia ML/Stats ecosystem. - -### List of projects - MLJ is released under the MIT license. \toc diff --git a/jsoc/gsoc/compiler.md b/jsoc/gsoc/compiler.md index 4c508cb21b..d3d127ede5 100644 --- a/jsoc/gsoc/compiler.md +++ b/jsoc/gsoc/compiler.md @@ -73,20 +73,3 @@ BoF calendar invite][threadcall] on the Julia Language Public Events calendar. **Recommended Skills**: Varies by project, but generally some multi-threading and C experience is needed\ **Contact:** [Jameson Nash](https://github.com/vtjnash) - - -## Automation of testing / performance benchmarking (350 hours) - -The Nanosoldier.jl project (and related ) tests for -performance impacts of some changes. However, there remains many areas that are not covered (such as -compile time) while other areas are over-covered (greatly increasing the duration of the test for no -benefit) and some tests may not be configured appropriately for statistical power. Furthermore, the -current reports are very primitive and can only do a basic pair-wise comparison, while graphs and -other interactive tooling would be more valuable. 
Thus, there would be many great projects for a -summer contributor to tackle here! - -**Expected Outcomes**: Improvement of Julia's automated testing/benchmarking framework. -**Skills**: Interest in and/or experience with CI systems. -**Difficulty**: Medium - -**Contact:** [Jameson Nash](https://github.com/vtjnash), [Tim Besard](https://github.com/maleadt) diff --git a/jsoc/gsoc/contractionorder.md b/jsoc/gsoc/contractionorder.md index 1a5dfed9f2..30fbdc1e7b 100644 --- a/jsoc/gsoc/contractionorder.md +++ b/jsoc/gsoc/contractionorder.md @@ -1,4 +1,4 @@ -# Tensor network contraction order optimization and visualization +# Tensor network contraction order optimization and visualization – Summer of Code [OMEinsum.jl](https://github.com/under-Peter/OMEinsum.jl) is a pure Julia package for tensor network computation, which has been used in various projects, including diff --git a/jsoc/gsoc/documenter.md b/jsoc/gsoc/documenter.md index 7fadf6418c..a88e8070b0 100644 --- a/jsoc/gsoc/documenter.md +++ b/jsoc/gsoc/documenter.md @@ -1,4 +1,4 @@ -# Documentation tooling +# Documentation tooling – Summer of Code ## Documenter.jl diff --git a/jsoc/gsoc/gpu.md b/jsoc/gsoc/gpu.md new file mode 100644 index 0000000000..baaaaa23f9 --- /dev/null +++ b/jsoc/gsoc/gpu.md @@ -0,0 +1,28 @@ +# GPU Projects - Summer of Code + +[JuliaGPU](https://github.com/JuliaGPU) provides a suite of packages for programming GPUs in Julia. We have support for AMD, NVIDIA and Intel GPUs through various backends, unified by high-level array abstractions and a common programming model based on kernel programming. + +## Improving GPU Stack Portability + +**Difficulty:** Medium + +**Duration:** 175 or 350 hours (the scope of functionality to port can be adjusted accordingly) + +**Description:** The Julia GPU stack consists of several layers, from low-level vendor-specific packages like CUDA.jl to high-level abstractions like GPUArrays.jl. 
While the high-level packages aim to be vendor-agnostic, many optimized operations are still implemented in vendor-specific ways. This project aims to improve portability by moving these implementations to GPUArrays.jl using KernelAbstractions.jl. + +The project will involve: +- Identifying vendor-specific kernel implementations in packages like CUDA.jl +- Porting these kernels to KernelAbstractions.jl in GPUArrays.jl +- Improving KernelAbstractions.jl where needed to support these kernels +- Ensuring performance remains competitive with vendor-specific implementations +- Adding tests to verify correctness across different GPU backends + +**Required Skills:** +- Experience with Julia programming +- Familiarity with GPU programming concepts +- Experience with GPU programming in Julia is a plus +- Understanding of performance optimization + +**Expected Results:** A set of optimized GPU kernels in GPUArrays.jl that are vendor-agnostic and performant across different GPU backends. This will improve the portability of the Julia GPU stack and make it easier to support new GPU architectures. + +**Mentors:** [Tim Besard](https://github.com/maleadt), [Valentin Churavy](https://github.com/vchuravy) diff --git a/jsoc/gsoc/graphics.md b/jsoc/gsoc/graphics.md index e186ec130b..30ba3bae14 100644 --- a/jsoc/gsoc/graphics.md +++ b/jsoc/gsoc/graphics.md @@ -3,7 +3,7 @@ Removed reference to this file on the main projects page for summer 2021 since they weren't updated. --> -# Graphic Projects – Summer of Code +# Graphic Projects – Summer of Code ## Makie diff --git a/jsoc/gsoc/juliaconstraints.md b/jsoc/gsoc/juliaconstraints.md index d828df7004..1a9733efb0 100644 --- a/jsoc/gsoc/juliaconstraints.md +++ b/jsoc/gsoc/juliaconstraints.md @@ -1,4 +1,4 @@ -# Constraint Programming in Julia +# Constraint Programming in Julia – Summer of Code [JuliaConstraints](https://juliaconstraints.github.io/) is an organization supporting packages for Constraint Programming in Julia. 
Although it is independent of it, it aims for a tight integration with JuMP.jl over time. For a detailed overview of basic Constraint Programming in Julia, please have a look at our video from JuliaCon 2021 [Put some constraints into your life with JuliaCon(straints)](https://youtu.be/G4siuvNMj0c). diff --git a/jsoc/gsoc/juliagenai.md b/jsoc/gsoc/juliagenai.md index 07ce3c1456..f4f868865f 100644 --- a/jsoc/gsoc/juliagenai.md +++ b/jsoc/gsoc/juliagenai.md @@ -1,4 +1,4 @@ -# JuliaGenAI Projects +# JuliaGenAI Projects – Summer of Code ![JuliaGenAI Logo](https://github.com/JuliaGenAI/juliagenai.org/blob/main/assets/logos/logo-256.png?raw=true) @@ -68,7 +68,7 @@ Julia stands out as a high-performance language that's essential yet underrepres **Project Goals and Deliverables:** 1. **Knowledge Base Expansion:** Grow the AIHelpMe.jl knowledge base to include comprehensive, up-to-date resources from critical Julia ecosystems such as the Julia documentation site, DataFrames, Makie, Plots/StatsPlots, the Tidier-verse, SciML, and more. See [Github Issue](https://github.com/svilupp/AIHelpMe.jl/issues/3) for more details. This expansion is crucial for enriching the context and accuracy of AI-generated responses related to Julia programming. - + 2. **Performance Tuning:** Achieve at least a 10% improvement in accuracy and relevance on a golden Q&A dataset, refining the AIHelpMe.jl Q&A pipeline for enhanced performance. ### Project 4: Enhancing Julia's AI Ecosystem with ColBERT v2 for Efficient Document Retrieval @@ -147,7 +147,7 @@ As a pivotal resource for the Julia community, the [Julia LLM Leaderboard](https ### Project 7: Counterfactuals for LLMs (*Model Explainability* and *Generative AI*) -**Project Overview:** This project aims to extend the functionality of [CounterfactualExplanations.jl](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl) to Large Language Models (LLMs). 
As a backbone for this, support for computing feature attributions for LLMs will also need to be implemented. The project will contribute to both [Taija](https://github.com/JuliaTrustworthyAI) and [JuliaGenAI](https://github.com/JuliaGenAI). +**Project Overview:** This project aims to extend the functionality of [CounterfactualExplanations.jl](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl) to Large Language Models (LLMs). As a backbone for this, support for computing feature attributions for LLMs will also need to be implemented. The project will contribute to both [Taija](https://github.com/JuliaTrustworthyAI) and [JuliaGenAI](https://github.com/JuliaGenAI). **Mentor:** [Jan Siml](https://github.com/svilupp) (JuliaGenAI) and [Patrick Altmeyer](https://github.com/pat-alt) (Taija) @@ -170,6 +170,6 @@ As a pivotal resource for the Julia community, the [Julia LLM Leaderboard](https We'd love to hear your ideas and discuss potential projects with you. -Probably the easiest way is to join our [JuliaLang Slack](https://julialang.org/slack/) and join the `#generative-ai` channel. +Probably the easiest way is to join our [JuliaLang Slack](https://julialang.org/slack/) and join the `#generative-ai` channel. You can also reach out to us on [Julia Zulip](https://julialang.zulipchat.com/#narrow/stream/423470-generative-ai) or post a GitHub Issue on our website [JuliaGenAI](https://github.com/JuliaGenAI/juliagenai.org). diff --git a/jsoc/gsoc/juliahealth.md b/jsoc/gsoc/juliahealth.md index bbd562da6f..79efd606c1 100644 --- a/jsoc/gsoc/juliahealth.md +++ b/jsoc/gsoc/juliahealth.md @@ -1,4 +1,4 @@ -# JuliaHealth Projects +# JuliaHealth Projects – Summer of Code JuliaHealth is an organization dedicated to improving healthcare by promoting open-source technologies and data standards. 
Our community is made up of researchers, data scientists, software developers, and healthcare professionals who are passionate about using technology to improve patient outcomes and promote data-driven decision-making. @@ -10,7 +10,7 @@ We believe that by working together and sharing our knowledge and expertise, we **Description:** The OMOP Common Data Model (OMOP CDM) is a widely used data standard that allows researchers to analyze large, heterogeneous healthcare datasets in a consistent and efficient manner. JuliaHealth has several packages that can interact with databases that adhere to the OMOP CDM (such as OMOPCDMCohortCreator.jl or OMOPCDMDatabaseConnector.jl). -For this project, we are looking for students interested in further developing the tooling in Julia to interact with OMOP CDM databases. +For this project, we are looking for students interested in further developing the tooling in Julia to interact with OMOP CDM databases. - **Mentor:** Jacob Zelko (aka TheCedarPrince) [email: jacobszelko@gmail.com] @@ -18,23 +18,23 @@ For this project, we are looking for students interested in further developing t - **Duration**: 350 hours -- **Suggested Skills and Background**: +- **Suggested Skills and Background**: - Experience with Julia - - Familiarity with some of the following Julia packages would be a strong asset: + - Familiarity with some of the following Julia packages would be a strong asset: - FunSQL.jl - - DataFrames.jl - - Distributed.jl - - OMOPCDMCohortCreator.jl - - OMOPCDMDatabaseConnector.jl + - DataFrames.jl + - Distributed.jl + - OMOPCDMCohortCreator.jl + - OMOPCDMDatabaseConnector.jl - OMOPCommonDataModel.jl - Comfort with the OMOP Common Data Model (or a willingness to learn!) 
-- **Potential Outcomes:** +- **Potential Outcomes:** Some potential project outcomes could be: - Expanding OMOPCDMCohortCreator.jl to enable users to add constraints to potential patient populations they want to create such as conditional date ranges for a given drug or disease diagnosis. -- Support parallelization of OMOPCDMCohortCreator.jl based queries when developing a patient population. +- Support parallelization of OMOPCDMCohortCreator.jl based queries when developing a patient population. - Develop and explore novel ways for how population filters within OMOPCDMCohortCreator.jl can be composed together for rapid analysis. In whatever functionality that gets developed for tools within JuliaHealth, it will also be expected for students to contribute to the existing package documentation to highlight how new features can be used. @@ -54,51 +54,51 @@ For this project, we are looking for students interested in developing PLP tooli - **Duration**: 350 hours -- **Suggested Skills and Background**: +- **Suggested Skills and Background**: - Experience with Julia - Exposure to machine learning concepts and ideas - - Familiarity with some of the following Julia packages would be a strong asset: - - DataFrames.jl - - OMOPCDMCohortCreator.jl - - MLJ.jl - - ModelingToolkit.jl + - Familiarity with some of the following Julia packages would be a strong asset: + - DataFrames.jl + - OMOPCDMCohortCreator.jl + - MLJ.jl + - ModelingToolkit.jl - Comfort with the OMOP Common Data Model (or a willingness to learn) -- **Outcomes:** +- **Outcomes:** This project will be very experimental and exploratory in nature. 
To constrain the expectations for this project, here is a possible approach students will follow while working on this project: - Review existing literature on approaches to PLP - Familiarize oneself with tools for machine learning and prediction within the Julia ecosystem - - Determine PLP research question to drive package development - - Develop PLP package utilizing JuliaHealth tools to work with an OMOP CDM database - - Test and validate PLP package for investigating the research question + - Determine PLP research question to drive package development + - Develop PLP package utilizing JuliaHealth tools to work with an OMOP CDM database + - Test and validate PLP package for investigating the research question - Document findings and draft JuliaCon talk In whatever functionality that gets developed for tools within JuliaHealth, it will also be expected for students to contribute to the existing package documentation to highlight how new features can be used. -For this project, it will be expected as part of the proposal to pursue drafting and giving a talk at JuliaCon. -Furthermore, although not required, publishing in the JuliaCon Proceedings will both be encouraged and supported by project mentors. +For this project, it will be expected as part of the proposal to pursue drafting and giving a talk at JuliaCon. +Furthermore, although not required, publishing in the JuliaCon Proceedings will both be encouraged and supported by project mentors. -Additionally, depending on the success of the package, there is a potential to run experiments on actual patient data to generate actual patient population insights based on a chosen research question. -This could possibly turn into a separate research paper, conference submission, or poster submission. -Whatever may occur in this situation will be supported by project mentors. 
+Additionally, depending on the success of the package, there is a potential to run experiments on actual patient data to generate actual patient population insights based on a chosen research question. +This could possibly turn into a separate research paper, conference submission, or poster submission. +Whatever may occur in this situation will be supported by project mentors. ## Medical Imaging Subecosystem Projects -[MedPipe3D.jl](https://github.com/JuliaHealth/MedPipe3D.jl) together with [MedEye3D.jl](https://github.com/JuliaHealth/MedEye3d.jl) [MedEval3D.jl](https://github.com/JuliaHealth/MedEval3D.jl) and currently in development [MedImage.jl](https://github.com/JuliaHealth/MedImage.jl) is a set of libraries created to provide essential tools for 3D medical imaging to the Julia language ecosystem. +[MedPipe3D.jl](https://github.com/JuliaHealth/MedPipe3D.jl) together with [MedEye3D.jl](https://github.com/JuliaHealth/MedEye3d.jl) [MedEval3D.jl](https://github.com/JuliaHealth/MedEval3D.jl) and currently in development [MedImage.jl](https://github.com/JuliaHealth/MedImage.jl) is a set of libraries created to provide essential tools for 3D medical imaging to the Julia language ecosystem. MedImage is a package for the standardization of loading medical imaging data, and for its basic processing that takes into consideration its spatial metadata. -MedEye3D is a package that supports the display of medical imaging data. -MedEval3D has implemented some highly performant algorithms for calculating metrics needed to asses the performance of 3d segmentation models. -MedPipe3D was created as a package that improves integration between other parts of the small ecosystem (MedEye3D, MedEval3D, and MedImage). +MedEye3D is a package that supports the display of medical imaging data. +MedEval3D has implemented some highly performant algorithms for calculating metrics needed to asses the performance of 3d segmentation models. 
+MedPipe3D was created as a package that improves integration between other parts of the small ecosystem (MedEye3D, MedEval3D, and MedImage). ### Project 3: Adding functionalities to medical imaging visualizations -**Description:** -MedEye3D is a package that supports the display of medical imaging data. It includes multiple functionalities specific to this use case like automatic windowing to display soft tissues, lungs, and other tissues. The display takes into account voxel spacing, support of overlaying display for multimodal imaging, and more. All with high performance powered by OpenGL and Rocket.jl. Still, a lot of further improvements are possible and are described in the Potential Outcomes section. +**Description:** +MedEye3D is a package that supports the display of medical imaging data. It includes multiple functionalities specific to this use case like automatic windowing to display soft tissues, lungs, and other tissues. The display takes into account voxel spacing, support of overlaying display for multimodal imaging, and more. All with high performance powered by OpenGL and Rocket.jl. Still, a lot of further improvements are possible and are described in the Potential Outcomes section. - **Mentor:** Jakub Mitura [email: jakub.mitura14@gmail.com] @@ -106,12 +106,12 @@ MedEye3D is a package that supports the display of medical imaging data. It incl - **Duration**: 350 hours -- **Suggested Skills and Background**: +- **Suggested Skills and Background**: - Experience with Julia - Basic familiarity with computer graphics preferably OpenGL example [link](https://www.opengl-tutorial.org/beginners-tutorials/) - Some experience with 3d volumetric data with spatial metadata (or a willingness to learn!) 
look into for example [link](https://simpleitk.readthedocs.io/en/master/fundamentalConcepts.html) -- **Potential Outcomes:** +- **Potential Outcomes:** Although MedEye3D already supports displaying medical images, there are still some functionalities that will be useful for the implementation of some more advanced algorithms, like supervoxel segmentation or image registration (and both of them are crucial for solving a lot of important problems in medical imaging). To achieve this this project's goal is to implement. 1) Developing support for multiple image viewing with indicators for image registration like display of the borders, and display lines connecting points. 2) Automatic correct windowing for MRI and PET. @@ -130,7 +130,7 @@ Although MedEye3D already supports displaying medical images, there are still so ### Project 4: Adding dataset-wide functions and integrations of augmentations -**Description:** +**Description:** MedPipe3D was created as a package that improves integration between other parts of the small ecosystem (MedEye3D, MedEval3D, and MedImage). Currently, it needs to be expanded and adapted so it can be a basis for a fully functional medical imaging pipeline. It requires utilities for preprocessing specific to medical imaging - like uniformization of spacing, orientation, cropping, or padding. It needs to k fold cross validation and simple ensembling. Other necessary part of the segmentation pipeline are the augmentations that should be easier to use, and provide test time augmentation for uncertainty quantification. The last thing in the pipeline that is also important for practitioners is postprocessing - and the most popular postprocessing is finding and keeping only the largest connected component. 
- **Mentor:** Jakub Mitura [email: jakub.mitura14@gmail.com] @@ -139,14 +139,14 @@ MedPipe3D was created as a package that improves integration between other parts - **Duration**: 350 hours -- **Suggested Skills and Background**: +- **Suggested Skills and Background**: - Experience with Julia - - Familiarity with some of the following Julia packages would be a strong asset: - - MedEye3D.jl + - Familiarity with some of the following Julia packages would be a strong asset: + - MedEye3D.jl - MedEval3D.jl -- **Potential Outcomes:** +- **Potential Outcomes:** 1) Integrate augmentations like rotations recalling gamma etc. 2) Enable invertible augmentations and support test time augmentations. 3) Add patch-based data loading with probabilistic oversampling. @@ -160,9 +160,9 @@ This set of changes although time-consuming to implement should not pose a signi - **Success criteria and time needed:** How the success of functionality described above is defined and the approximate time required for each. -1) Given the configuration struct supplied by the user the supplied augmentations will be executed with some defined probability after loading the image: Brightness transform, Contrast augmentation transform, Gamma Transform, Gaussian noise transform, Rician noise transform, Mirror transform, Scale transform, Gaussian blur transform, Simulate low-resolution transform, Elastic deformation transform -100h. +1) Given the configuration struct supplied by the user the supplied augmentations will be executed with some defined probability after loading the image: Brightness transform, Contrast augmentation transform, Gamma Transform, Gaussian noise transform, Rician noise transform, Mirror transform, Scale transform, Gaussian blur transform, Simulate low-resolution transform, Elastic deformation transform -100h. 
2) Enable some transformation to be executed on the model input, then inverse this transform on the model output; execute model inference n times when n is supplied by the user and return mean and standard deviation of segmentation masks produced by the model as the output -60h. -3) given the size of the 3D patch by the user algorithm after data loading will crop or pad the supplied image to meet the set size criterion. The part of the image where the label is present should be selected more frequently than the areas without during cropping, the probability that the area with some label indicated on segmentation mas will be chosen will equal p (0-1) where p is supplied by the user -40h. +3) given the size of the 3D patch by the user algorithm after data loading will crop or pad the supplied image to meet the set size criterion. The part of the image where the label is present should be selected more frequently than the areas without during cropping, the probability that the area with some label indicated on segmentation mas will be chosen will equal p (0-1) where p is supplied by the user -40h. 4) given the list of paths to medical images it will load them calculate the mean or median spacing (option selected by the user), and return it. Then during pipeline execution, all images should be resampled to a user-supplied spacing and user-supplied orientation - 40h. 5) Given a model output and a threshold that will be used for each channel of the output to binarize the output user will have an option to retrieve only n largest components from binarized algorithm output - 20h. 6) Probabilities and hyperparameters of all augmentations, thresholds for binarization of output channels chosen spacing for preprocessing, number and settings of test time augmentations should be available in a hyperparam struct that is the additional argument of the pipeline function and that can be used for hyperparameter tuning -30h. 
diff --git a/jsoc/gsoc/machine-learning.md b/jsoc/gsoc/machine-learning.md index 1eaa258282..35a5fdc825 100644 --- a/jsoc/gsoc/machine-learning.md +++ b/jsoc/gsoc/machine-learning.md @@ -1,4 +1,4 @@ -# Machine Learning Projects - Summer of Code +# Machine Learning Projects - Summer of Code **Note: FluxML participates as a NumFOCUS sub-organization. Head to [the FluxML GSoC page](http://fluxml.ai/gsoc/) for their idea list.** diff --git a/jsoc/gsoc/pluto.md b/jsoc/gsoc/pluto.md index 13c940fc8f..2833ef1728 100644 --- a/jsoc/gsoc/pluto.md +++ b/jsoc/gsoc/pluto.md @@ -1,5 +1,3 @@ -# Pluto.jl projects +# Pluto.jl projects - Summer of Code Unfortunately we won't have time to mentor this year.  Check back next year! - - diff --git a/jsoc/gsoc/pythia.md b/jsoc/gsoc/pythia.md index 44395439e8..a93e1f8959 100644 --- a/jsoc/gsoc/pythia.md +++ b/jsoc/gsoc/pythia.md @@ -1,4 +1,5 @@ # Pythia – Summer of Code + ## Machine Learning Time Series Regression [Pythia](https://github.com/ababii/Pythia.jl) is a package for scalable machine learning time series forecasting and nowcasting in Julia. @@ -7,9 +8,9 @@ The project mentors are [Andrii Babii](https://ababii.github.io/) and [Sebastian ## Machine learning for nowcasting and forecasting -This project involves developing scalable machine learning time series regressions for nowcasting and forecasting. Nowcasting in economics is the prediction of the present, the very near future, and the very recent past state of an economic indicator. The term is a contraction of "now" and "forecasting" and originates in meteorology. +This project involves developing scalable machine learning time series regressions for nowcasting and forecasting. Nowcasting in economics is the prediction of the present, the very near future, and the very recent past state of an economic indicator. The term is a contraction of "now" and "forecasting" and originates in meteorology. 
-The objective of this project is to introduce scalable regression-based nowcasting and forecasting methodologies that demonstrated the empirical success in data-rich environment recently. Examples of existing popular packages for regression-based nowcasting on other platforms include the "MIDAS Matlab Toolbox", as well as the 'midasr' and 'midasml' packages in R. The starting point for this project is porting the 'midasml' package from R to Julia. Currently Pythia has the sparse-group LASSO regression functionality for forecasting. +The objective of this project is to introduce scalable regression-based nowcasting and forecasting methodologies that demonstrated the empirical success in data-rich environment recently. Examples of existing popular packages for regression-based nowcasting on other platforms include the "MIDAS Matlab Toolbox", as well as the 'midasr' and 'midasml' packages in R. The starting point for this project is porting the 'midasml' package from R to Julia. Currently Pythia has the sparse-group LASSO regression functionality for forecasting. The following functions are of interest: in-sample and out-of sample forecasts/nowcasts, regularized MIDAS with Legendre polynomials, visualization of nowcasts, AIC/BIC and time series cross-validation tuning, forecast evaluation, pooled and fixed effects panel data regressions for forecasting and nowcasting, HAC-based inference for sparse-group LASSO, high-dimensional Granger causality tests. Other widely used existing functions from R/Python/Matlab are also of interest. @@ -21,7 +22,7 @@ The following functions are of interest: in-sample and out-of sample forecasts/n ## Time series forecasting at scales -Modern business applications often involve forecasting hundreds of thousands of time series. 
Producing such a gigantic number of reliable and high-quality forecasts is computationally challenging, which limits the scope of potential methods that can be used in practice, see, e.g., the 'forecast', 'fable', or 'prophet' packages in R. Currently, Julia lacks the scalable time series forecasting functionality and this project aims to develop the automated data-driven and scalable time series forecasting methods. +Modern business applications often involve forecasting hundreds of thousands of time series. Producing such a gigantic number of reliable and high-quality forecasts is computationally challenging, which limits the scope of potential methods that can be used in practice, see, e.g., the 'forecast', 'fable', or 'prophet' packages in R. Currently, Julia lacks the scalable time series forecasting functionality and this project aims to develop the automated data-driven and scalable time series forecasting methods. The following functionality is of interest: forecasting intermittent demand (Croston, adjusted Croston, INARMA), scalable seasonal ARIMA with covariates, loss-based forecasting (gradient boosting), unsupervised time series clustering, forecast combinations, unit root tests (ADF, KPSS). Other widely used existing functions from R/Python/Matlab are also of interest. diff --git a/jsoc/gsoc/quantumclifford.md b/jsoc/gsoc/quantumclifford.md index fea10dff65..0d5f25a126 100644 --- a/jsoc/gsoc/quantumclifford.md +++ b/jsoc/gsoc/quantumclifford.md @@ -1,4 +1,4 @@ -# Tools for simulation of Quantum Clifford Circuits +# Tools for simulation of Quantum Clifford Circuits - Summer of Code Clifford circuits are a class of quantum circuits that can be simulated efficiently on a classical computer. As such, they do not provide the computational advantage expected of universal quantum computers. Nonetheless, they are extremely important, as they underpin most techniques for quantum error correction and quantum networking. 
Software that efficiently simulates such circuits, at the scale of thousands or more qubits, is essential to the design of quantum hardware. The [QuantumClifford.jl](https://github.com/Krastanov/QuantumClifford.jl) Julia project enables such simulations. @@ -99,4 +99,4 @@ Magic states are important non-stabilizer states that can be used for inducing n **Expected duration:** 175 hours (but applicants can scope it as longer if they plan more extensive work) -**Difficulty:** Hard \ No newline at end of file +**Difficulty:** Hard diff --git a/jsoc/gsoc/quantumoptics.md b/jsoc/gsoc/quantumoptics.md index bc66d1a5e8..bc6941217c 100644 --- a/jsoc/gsoc/quantumoptics.md +++ b/jsoc/gsoc/quantumoptics.md @@ -1,4 +1,4 @@ -# Quantum Optics and State Vector Modeling Tools +# Quantum Optics and State Vector Modeling Tools - Summer of Code The most common way to represent and model quantum states is the state vector formalism (underlying Schroedinger's and Heisenberg's equations as well as many other master equations). The [QuantumOptics.jl](https://github.com/qojulia/QuantumOptics.jl) Julia project enables such simulations, utilizing much of the uniquely powerful DiffEq infrastructure in Julia. 
@@ -36,4 +36,4 @@ SciML is the umbrella organization for much of the base numerical software devel **Expected duration:** 175 hours (but applicants can scope it as longer if they plan more extensive work) -**Difficulty:** Easy \ No newline at end of file +**Difficulty:** Easy diff --git a/jsoc/gsoc/symbolics.md b/jsoc/gsoc/symbolics.md index 40e7c0b0cf..4cb2f8fb68 100644 --- a/jsoc/gsoc/symbolics.md +++ b/jsoc/gsoc/symbolics.md @@ -1,4 +1,4 @@ -# Symbolic computation project ideas +# Symbolic computation project ideas - Summer of Code ## Efficient Tensor Differentiation diff --git a/jsoc/gsoc/taija.md b/jsoc/gsoc/taija.md index 359fc05b84..9cd14eb762 100644 --- a/jsoc/gsoc/taija.md +++ b/jsoc/gsoc/taija.md @@ -1,4 +1,4 @@ -# Taija Projects +# Taija Projects - Summer of Code [Taija](https://github.com/JuliaTrustworthyAI) is an organization that hosts software geared towards Trustworthy Artificial Intelligence in Julia. Taija currently covers a range of approaches towards making AI systems more trustworthy: @@ -20,7 +20,7 @@ There is a high overlap with organizations, you might be also interested in: ## Project 1: Conformal Prediction meets Bayes (*Predictive Uncertainty*) -**Project Overview:** [ConformalPrediction.jl](https://github.com/JuliaTrustworthyAI/ConformalPrediction.jl) is a package for Predictive Uncertainty Quantification through Conformal Prediction for Machine Learning models trained in MLJ. This project aims to enhance ConformalPrediction.jl by adding support for [Conformal(ized) Bayes](https://github.com/JuliaTrustworthyAI/ConformalPrediction.jl/issues/64). +**Project Overview:** [ConformalPrediction.jl](https://github.com/JuliaTrustworthyAI/ConformalPrediction.jl) is a package for Predictive Uncertainty Quantification through Conformal Prediction for Machine Learning models trained in MLJ. 
This project aims to enhance ConformalPrediction.jl by adding support for [Conformal(ized) Bayes](https://github.com/JuliaTrustworthyAI/ConformalPrediction.jl/issues/64). **Mentor:** [Patrick Altmeyer](https://github.com/pat-alt) and/or [Mojtaba Farmanbar](https://nl.linkedin.com/in/mfarmanbar) @@ -41,7 +41,7 @@ There is a high overlap with organizations, you might be also interested in: ## Project 2: Counterfactual Regression (*Model Explainability*) -**Project Overview:** [CounterfactualExplanations.jl](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl) is a package for Counterfactual Explanations and Algorithmic Recourse in Julia. This project aims to extend the package functionality to [regression models](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl/issues/388). +**Project Overview:** [CounterfactualExplanations.jl](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl) is a package for Counterfactual Explanations and Algorithmic Recourse in Julia. This project aims to extend the package functionality to [regression models](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl/issues/388). **Mentor:** [Patrick Altmeyer](https://github.com/pat-alt) @@ -58,10 +58,10 @@ There is a high overlap with organizations, you might be also interested in: - Carefully think about architecture choices: how can we fit support for regression models into the existing code base? - Add support for the following approaches: [ad-hoc thresholding](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl/issues/391), [Bayesian optimisation](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl/issues/392), [information-theoretic saliency](https://openreview.net/forum?id=IrEYkhuxup&noteId=IrEYkhuxup). 
- Comprehensively test and document your work - + ## Project 3: Counterfactuals for LLMs (*Model Explainability* and *Generative AI*) -**Project Overview:** This project aims to extend the functionality of [CounterfactualExplanations.jl](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl) to Large Language Models (LLMs). As a backbone for this, support for computing feature attributions for LLMs will also need to be implemented. The project will contribute to both [Taija](https://github.com/JuliaTrustworthyAI) and [JuliaGenAI](https://github.com/JuliaGenAI). +**Project Overview:** This project aims to extend the functionality of [CounterfactualExplanations.jl](https://github.com/JuliaTrustworthyAI/CounterfactualExplanations.jl) to Large Language Models (LLMs). As a backbone for this, support for computing feature attributions for LLMs will also need to be implemented. The project will contribute to both [Taija](https://github.com/JuliaTrustworthyAI) and [JuliaGenAI](https://github.com/JuliaGenAI). **Mentor:** [Patrick Altmeyer](https://github.com/pat-alt) (Taija) and [Jan Siml](https://github.com/svilupp) (JuliaGenAI) @@ -95,7 +95,7 @@ This extension aims to enhance the CounterfactualExplanations.jl package by inco - Experience with Julia - Background in causality and familiarity with counterfactual reasoning. - Basic knowledge of minimal interventions and causal graph building. - + **Project Goals and Deliverables:** - Carefully think about architecture choices: how can we fit support for causal interventions into the existing code base? - Develop code that could integrate causal graph building with other Julia libs such as [Graphs.jl](https://github.com/JuliaGraphs/Graphs.jl), [GraphPlot.jl](https://juliagraphs.org/GraphPlot.jl/) and [CausalInference.jl](https://github.com/mschauer/CausalInference.jl). 
@@ -104,7 +104,7 @@ This extension aims to enhance the CounterfactualExplanations.jl package by inco ## About Us -[Patrick Altmeyer](https://www.paltmeyer.com/) is a PhD Candidate in Trustworthy Artificial Intelligence at Delft University of Technology working on the intersection of Computer Science and Finance. He has presented work related to Taija at JuliaCon 2022 and 2023. In the past year, Patrick has mentored multiple groups of students at Delft University of Technology who have made major contributions to Taija. +[Patrick Altmeyer](https://www.paltmeyer.com/) is a PhD Candidate in Trustworthy Artificial Intelligence at Delft University of Technology working on the intersection of Computer Science and Finance. He has presented work related to Taija at JuliaCon 2022 and 2023. In the past year, Patrick has mentored multiple groups of students at Delft University of Technology who have made major contributions to Taija. ## How to Contact Us diff --git a/jsoc/gsoc/tooling.md b/jsoc/gsoc/tooling.md index 784c44ae1e..f14b70aa23 100644 --- a/jsoc/gsoc/tooling.md +++ b/jsoc/gsoc/tooling.md @@ -1,2 +1,17 @@ -# Tooling +# Tooling - Summer of Code +## Automation of testing / performance benchmarking (350 hours) + +The Nanosoldier.jl project (and related ) tests for +performance impacts of some changes. However, there remains many areas that are not covered (such as +compile time) while other areas are over-covered (greatly increasing the duration of the test for no +benefit) and some tests may not be configured appropriately for statistical power. Furthermore, the +current reports are very primitive and can only do a basic pair-wise comparison, while graphs and +other interactive tooling would be more valuable. Thus, there would be many great projects for a +summer contributor to tackle here! + +**Expected Outcomes**: Improvement of Julia's automated testing/benchmarking framework. +**Skills**: Interest in and/or experience with CI systems. 
+**Difficulty**: Medium + +**Contact:** [Jameson Nash](https://github.com/vtjnash), [Tim Besard](https://github.com/maleadt) diff --git a/jsoc/gsoc/trixi.md b/jsoc/gsoc/trixi.md index 16c9fef0bc..2056e0c48d 100644 --- a/jsoc/gsoc/trixi.md +++ b/jsoc/gsoc/trixi.md @@ -1,4 +1,4 @@ -# Modern computational fluid dynamics with Trixi.jl +# Modern computational fluid dynamics with Trixi.jl - Summer of Code [Trixi.jl](https://github.com/trixi-framework/Trixi.jl/) is a Julia package for adaptive high-order numerical simulations of conservation laws. It is designed to be simple to use @@ -97,10 +97,10 @@ This project is good for both software engineers interested in the fields of sci The possible subtasks in this project include: -- Implementing the abstract tree initialization process on GPUs. -- Exploring the [`TreeMesh`](https://trixi-framework.github.io/Trixi.jl/stable/meshes/tree_mesh/) initialization processes on GPUs based on the implementation of the first task and combining them. -- Integrating the above into [`AMRCallback`](https://trixi-framework.github.io/Trixi.jl/stable/tutorials/adaptive_mesh_refinement/#Callback) in the simulation using [dynamic parallelism](https://cuda.juliagpu.org/stable/api/kernel/#Dynamic-parallelism) (via CUDA.jl). -- Optimizing the code for data transfer, kernel launch overhead, occupancy, etc. +- Implementing the abstract tree initialization process on GPUs. +- Exploring the [`TreeMesh`](https://trixi-framework.github.io/Trixi.jl/stable/meshes/tree_mesh/) initialization processes on GPUs based on the implementation of the first task and combining them. +- Integrating the above into [`AMRCallback`](https://trixi-framework.github.io/Trixi.jl/stable/tutorials/adaptive_mesh_refinement/#Callback) in the simulation using [dynamic parallelism](https://cuda.juliagpu.org/stable/api/kernel/#Dynamic-parallelism) (via CUDA.jl). +- Optimizing the code for data transfer, kernel launch overhead, occupancy, etc. 
- Starting the above work in 1D and then expanding it to 2D and 3D problems. - (Optional) Try similar work for [`P4estMesh`](https://trixi-framework.github.io/Trixi.jl/stable/meshes/p4est_mesh/) in 2D and 3D. @@ -110,4 +110,4 @@ This project is good for people who are interested in GPU programming, parallel **Expected results:** A working example of AMR running on GPUs. -**Mentors**: [Huiyu Xie](https://github.com/huiyuxie), [Jesse Chan](https://github.com/jlchan), [Hendrik Ranocha](https://github.com/ranocha) \ No newline at end of file +**Mentors**: [Huiyu Xie](https://github.com/huiyuxie), [Jesse Chan](https://github.com/jlchan), [Hendrik Ranocha](https://github.com/ranocha) diff --git a/jsoc/gsoc/vscode.md b/jsoc/gsoc/vscode.md index 7b64181f0e..74b96914ea 100644 --- a/jsoc/gsoc/vscode.md +++ b/jsoc/gsoc/vscode.md @@ -1,4 +1,4 @@ -# VS Code projects +# VS Code projects - Summer of Code ## VS Code extension diff --git a/jsoc/gsod/projects.md b/jsoc/gsod/projects.md index 66360a8a37..3a736067f3 100644 --- a/jsoc/gsod/projects.md +++ b/jsoc/gsod/projects.md @@ -17,9 +17,9 @@ Below you can find a running list of potential GSoD projects. If any of these ar # Unifying the [JuliaHeath Organization](https://github.com/JuliaHealth) Documentation Landscape -## About your organization -> [!Note] -In this section, tell us about your organization or project in a few short paragraphs. What problem does your project solve? Who are your users and contributors? How long has your organization or project been in existence? Give some context to help us understand why funding your proposal would create a positive impact in open source and the world. +## About your organization + + The Julia Programming Language is an MIT-licensed high-performance programming language designed for speed, usability, and reproducibility within both scientific and general purpose computing. 
Currently the Julia community has over 7,000 registered Julia packages, 35 million+ downloads of Julia, and thousands of contributors worldwide. @@ -34,8 +34,8 @@ As the entire JuliaHealth user community comprises more than 250 registered user Currently, there are various subecosystems such as the Medical Imaging and the Observational Health subecosystem with more subecosystems beginning to emerge. ## Your project’s problem -> [!Note] -Tell us about the problem your project will help solve. Why is it important to your organization or project to solve this problem? + + With JuliaHealth's terrific growth over the years -- both in terms of growth in users, members, and actively maintained packages -- we are beginning to see the need for more unified documentation. Without this unified documentation, we are seeing: @@ -55,8 +55,8 @@ Users and developers want to engage with the JuliaHealth community, but if we do - Additionally, as we construct solutions within JuliaHealth to address the needs we have encountered as a growing organization, we will share our insights to the broader Julia community to illustrate various methods other ecosystems within Julia can adapt to meet growing demand. ## Your project’s scope -> [!Note] -Tell us about what documentation your organization will create, update, or improve. If some work is deliberately not being done, include that information as well. Include a time estimate, and whether you have already identified organization volunteers and a technical writer to work with your project. + + Although there are many subecosystems within JuliaHealth, our project will be scoped to specifically the Medical Imaging subecosystem as it has grown mature enough to encounter many of these problems already. Working on documentation around the Medical Imagining subecosystem will benefit the rest of the JuliaHealth ecosystem as it will provide a roadmap for how other subecosystems can best document themselves and support their users. 
@@ -74,12 +74,12 @@ This includes: - Add FAQ or support page - Define and implement tracking metrics to monitor user engagement and interaction with the platform - Using an open source and GDPR compliant technology like GoatCounter - + Once this initial groundwork is done, we will then address some of the specific core tooling within the Medical Imaging subecosystem. Due to the modular nature of packages within this subecosystem, we will need to improve documentation across various packages to show what they should be used for, how they integrate with one another, and how to onboard as a potential new contributor: - Documentation tasks for [MedImage]( https://github.com/JuliaHealth/MedImage.jl) - - Introduction to the theory of medical imaging formats and spatial metadata + - Introduction to the theory of medical imaging formats and spatial metadata - Describe how to load and save image - Describe how apply basic transformation using MedImage - Documentation tasks for [MedEye3d](https://github.com/JuliaHealth/MedEye3d.jl) @@ -161,8 +161,8 @@ To explicitly enumerate what work is out of scope for this project, we do not pl - Adding docstrings or crosslinks may fall in scope depending on the needs per task ## Measuring your project’s success -> [!Note] -How will you know that your new documentation has helped solve your problem? What metrics will you use, and how will you track them? + + Currently, the documentation we do have does not yet have support for documentation traffic analytics. As of this moment, our best direct source for traffic metrics is to use [JuliaHub](https://juliahub.com/ui/Packages?q=JuliaHealth/) to monitor package downloads and also to reference GitHub stars for a loose approximation of "discoverability". @@ -188,8 +188,8 @@ For JuliaHealth, we would consider this project successful if: - A new blog post is published ## Timeline -> [!Note] -How long do you estimate this work will take? 
Are you able to breakdown the tech writer tasks by month/week? + + We assume the tech writer will put in part time hours (10-20 hours/week) during this time. ### Monthly Plan @@ -221,15 +221,15 @@ We assume the tech writer will put in part time hours (10-20 hours/week) during This timeline is largely accurate but we expect that different packages or tasks may be slightly more challenging than others. The November time period gives us the opportunity to revisit any unfinished tasks and to potentially explore stretch goals if there were not many outstanding tasks left. -### Communication Plan: +### Communication Plan: The primary communication channel we will use is [Julia Slack](https://julialang.org/slack/) and Dr. Jakub Mitura (MD, PhD) will be the individual responsible for all contact and mentoring throughout the project for regular updates and meetings. Outside of Slack, email will be used to handle communications with GSoD organizers and administrators with Jakub Mitura's email being: [jakub.mitura14@gmail.com](mailto:jakub.mitura14@gmail.com). Volunteers will also be available for communication on the Slack on an as-needed basis. Additionally, project updates will be given through the [Julia Health Slack Channel](https://app.slack.com/client/T68168MUP/C012NN70P5K) which is where the majority of JuliaHealth communication takes place between members, users, and the rest of the Julia community. ## Project Budget -> [!Note] -You can include your budget in your proposal or as a separate link. If your budget is fewer than ten items, we recommend including it in your proposal. 
+ + | Budget item | Amount | Running total | | ------------- | ------------- | ------------- | @@ -245,8 +245,8 @@ Additional justifications: - Sticker packs will also be given to welcome new contributors ### Additional information: -> [!Note] -Beyond the above proposal information, some additional notes about the composition of this project team: + + **About GSoD Project Lead:** diff --git a/jsoc/projects.md b/jsoc/projects.md index 1803811117..4ff296f798 100644 --- a/jsoc/projects.md +++ b/jsoc/projects.md @@ -13,6 +13,7 @@ We have our project ideas organized below roughly by domain but you can also see * [Graph neural networks](/jsoc/gsoc/gnn/) - Deep learning on graphs with GraphNeuralNetworks.jl. * [GUI](/jsoc/gsoc/gui/) - Projects related to Graphical User Interface toolkits * [High Performance and Parallel Computing](/jsoc/gsoc/hpc/) – write code that runs on lots of machines, goes really fast, processes lots of data, or all three. +* [GPU Programming](/jsoc/gsoc/gpu/) - Projects that involve the Julia GPU stack * [Images](/jsoc/gsoc/images/) – extend Julia's suite of tools for visualization and analysis of images. 
* [JuliaConstraints](/jsoc/gsoc/juliaconstraints/) - A collection of tools for Constraint Programming in Julia * [JuliaDynamics](/jsoc/gsoc/juliadynamics/) - Dynamical systems, complex systems and nonlinear dynamics in Julia diff --git a/utils.jl b/utils.jl index 0c811bcc5e..8147adb2b5 100644 --- a/utils.jl +++ b/utils.jl @@ -229,7 +229,22 @@ function hfun_all_gsoc_projects() for project in all_projects project in ("general.md", "tooling.md", "graphics.md") && continue endswith(project, ".md") || continue - write(md, read(joinpath(base_dir, project))) + for line in eachline(joinpath(base_dir, project)) + # remove any table of contents + if line == "\\toc" + continue + end + + # remove ' - Summer of Code' suffix from the primary header + line = replace(line, r"^# (.*) - Summer of Code" => s"# \1") + + # increase the header level by 1 + if startswith(line, "#") + line = "#" * line + end + + println(md, line) + end write(md, "\n\n") end allmd = String(take!(md))
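To see what the new per-line loop in `hfun_all_gsoc_projects` does to a single project page, the sketch below applies the same three steps (skip `\toc`, strip the ` - Summer of Code` suffix from the top-level header, push every header one level down) to an in-memory string. The function name `demote_project_page` and the `sample` text are hypothetical and exist only for illustration; the real code reads each file with `eachline(joinpath(base_dir, project))` and writes into the shared `md` buffer.

```julia
# Hypothetical helper: mirrors the per-line logic of the loop in the diff,
# but works on a string instead of a file so it can be run in isolation.
function demote_project_page(text::AbstractString)
    out = IOBuffer()
    for line in eachline(IOBuffer(String(text)))
        # drop the table-of-contents marker
        line == "\\toc" && continue

        # strip the suffix from the primary header only (anchored on "# ")
        line = replace(line, r"^# (.*) - Summer of Code" => s"# \1")

        # increase the header level by one
        if startswith(line, "#")
            line = "#" * line
        end

        println(out, line)
    end
    return String(take!(out))
end

# Hypothetical sample page, shaped like the project pages touched by this diff.
sample = """
# Tooling - Summer of Code

\\toc

## Automation of testing
Some description.
"""

print(demote_project_page(sample))
```

Two properties of this logic may be worth checking in review. The pattern matches only an ASCII hyphen before `Summer of Code`, so a title written with an en dash would keep its suffix on the combined page. The `startswith(line, "#")` test also applies to any line beginning with `#`, which would include comment lines inside fenced code blocks if a project page contains them.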