diff --git a/DESCRIPTION b/DESCRIPTION
index 30837da7f..48538a9a4 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
Package: gcamfaostat
Type: Package
Title: Prepare, process, and synthesize FAOSTAT data for global agroeconomic and multisector dynamic modeling
-Version: 1.1
+Version: 1.0.0
Date: 2023-05-06
Authors@R: c(person("Xin", "Zhao", email = "xin.zhao@pnnl.gov", role = c("cre", "aut"), comment = c(ORCID = "0000-0002-1801-4393")),
person("Maksym", "Chepeliev", role = "aut"),
diff --git a/README.md b/README.md
index 4f4bbf1c4..42b4889b3 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,12 @@
-[![codecov](https://codecov.io/gh/JGCRI/gcamdata/branch/main/graph/badge.svg)](https://codecov.io/gh/JGCRI/gcamdata)
-![R-CMD](https://github.com/JGCRI/gcamdata/workflows/R-CMD/badge.svg)
-![coverage-test](https://github.com/JGCRI/gcamdata/workflows/coverage-test/badge.svg)
+# gcamfaostat
+**gcamfaostat** is an R package to prepare, process, and synthesize FAOSTAT data for global agroeconomic and multisector dynamic modeling. The Food and Agriculture Organization Statistical Database ([FAOSTAT](https://www.fao.org/faostat/en/#data)) provides open-access data on country-level agricultural production, trade, food, nutrients, prices, land use, etc., serving as the most important data source for global agroeconomic and multisector dynamic models. **gcamfaostat** aims to shorten the distance between raw FAOSTAT data and economic modeling.
+
+# gcamfaostat and gcamdata
+**gcamfaostat** is built on an existing R package, **[gcamdata](https://jgcri.github.io/gcamdata/index.html)**, which has similar functions to **gcamfaostat**, though **gcamdata** covers a broader range of data inputs and is designed for the global multisector dynamic model **GCAM**. **gcamfaostat** uses the robust, reproducible, and transparent data processing system built into **[gcamdata](https://github.com/JGCRI/gcam-core)**. The two packages are consistent, while **gcamfaostat** focuses on agroeconomic data processing and can provide input data for **gcamdata** (and thus GCAM) and other models that rely on FAOSTAT data.
+
-# gcamdata-faostat
-Functions in this R package (**gcamdata-faostat**) download, clean, processe, connect, and visualize data from FAOSTAT for global economic and integrated assessment modeling. The package is built based on the existing gcamdata package structure for consistency, transparency, and traceability.
The goals of this version are:
(1) Check FAOSTAT data updates and download necessary datasets
@@ -13,6 +14,45 @@ The goals of this version are:
(3) Apply the new method of primary equivalent aggregation to aggregating FAO ~500 SUA (SCL) items to ~100 primary equivalent items in FAO Food Balance Sheet (FBS).
(4) Compare the balanced data compiled using different methods and visualize the difference.
+# User Guide
+The package is documented in the [online manual](https://realxinzhao.github.io/gcamfaostat/index.html).
+
+
+# Download and install:
+
+```r
+install.packages("devtools")
+devtools::install_github("realxinzhao/gcamfaostat")
+```
+# Load and run the gcamfaostat package
+
+Open the `gcamfaostat.Rproj` file in the `gcamfaostat` folder. RStudio should open the project.
+
+To load the `gcamfaostat` package, enter:
+
+```r
+devtools::load_all()
+```
+
+## Run the driver
+There are two ways to run the driver:
+
+1. Use `driver_drake()`:
+
+```r
+driver_drake()
+```
+
+`driver_drake()` runs the driver and stores the outputs in a hidden cache. When you run `driver_drake()` again it will skip steps that are up-to-date. This is useful if you will be adjusting the data inputs and code and running the data system multiple times. For this reason, we almost always recommend using `driver_drake()`. More details can be found in the [vignette](https://jgcri.github.io/gcamdata/articles/driverdrake_vignette.html).
+
+2. Use `driver()`:
+
+```r
+driver()
+```
+
+See [the documentation](https://jgcri.github.io/gcamdata/reference/driver.html) for more options when running the driver, such as what outputs to generate or when to stop.
+
+## Output files
+
+Users can specify the output directory (`DIR_OUTPUT_CSV`) for the output CSV files in `constants.R`. The default directory is `outputs/CSV`. The files are exported when `OUTPUT_Export_CSV == TRUE` (an option in `constants.R`).
+Users can also use the package functions to trace the processing step by step when `driver_drake()` is employed.
+
diff --git a/_pkgdown.yml b/_pkgdown.yml
index 30b6a77e2..a19d7319c 100644
--- a/_pkgdown.yml
+++ b/_pkgdown.yml
@@ -15,12 +15,21 @@ navbar:
left:
- icon: fa-home
href: index.html
- - text: "Getting Started"
- href: articles/getting-started/getting-started.html
- text: "Vignettes"
+ icon: fas fa-book
+ menu:
+ - text: "Getting Started"
+ href: articles/getting-started.html
+ - text: "User Modification Functions"
+ href: articles/usermod_vignette.html
- icon: fa-file-code-o
text: "Reference"
href: reference/index.html
+ - icon: fa-newspaper-o
+ text: "News"
+ href: articles/news.html
+
+
reference:
- title: Running gcamfaostat
diff --git a/docs/404.html b/docs/404.html
index dc8a02de6..f119083ba 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -32,7 +32,7 @@
gcamfaostat
- 1.1
+ 1.0.0
@@ -44,16 +44,36 @@
+
First release TBD The first release of gcamfaostat
+1.0.0 includes the data generated for the Global Change Analysis Model
+v7.0 GCAM
+v7.0. The source data downloaded from FAOSTAT is archived at a Zenodo repository.
Users may want to change the default gcamdata behavior by either
+modifying input assumptions or changing intermediate chunks. They can
+now write a “user modification” chunk that can be “plugged in” to the
+data system. This new chunk can modify any objects that are used or
+created in gcamdata and pass the modified object to all dependent
+chunks.
+
User-modification chunks have a format similar to other data chunks
+in gcamdata, except that instead of producing a new output, they return a
+modified data object that replaces the original data object in the data
+system. These new chunks can be added to driver_drake() or
+driver() using the new arguments
+user_modifications and xml_suffix, which tell
+gcamdata which modification function to use and what suffix to add to
+all impacted downstream xmls.
+
+
+
Example: Modify Shareweight
+
+
Below we show an example user-modification chunk to change a
+shareweight in an input csv file.
+
+
User Modification Chunk
+
+
Here we load in two csv files, “energy/A322.subsector_shrwt.csv” and
+“common/GCAM_region_names.csv”. We modify A322.subsector_shrwt, so we
+list it under driver.DECLARE_MODIFY, but do not modify
+GCAM_region_names, so it is listed under
+driver.DECLARE_INPUTS. Then, we set the shareweight column
+of the first row of A322.subsector_shrwt to NEW.SHWT.
+Finally, we use a new return_modified() function to return
+the modified A322.subsector_shrwt (note that we have to include the path
+for input files).
+
+usermod_fert <- function(command, ...) {
+  if(command == driver.DECLARE_MODIFY) {
+    return(c(FILE = "energy/A322.subsector_shrwt"))
+  } else if(command == driver.DECLARE_INPUTS) {
+    # In addition to the objects users want to modify we can also ask for any other
+    # inputs we need to do our operations but won't be modified
+    return(c(FILE = "common/GCAM_region_names"))
+  } else if(command == driver.MAKE) {
+    all_data <- list(...)[[1]]
+    GCAM_region_names <- get_data(all_data, "common/GCAM_region_names")
+    A322.subsector_shrwt <- get_data(all_data, "energy/A322.subsector_shrwt")
+
+    # Users could also read in additional files that exist outside of the data system
+    # They should do that manually instead of through the driver.DECLARE_INPUTS so as to
+    # avoid mixing user's custom files with Core files
+    # A23.globaltech_eff.mine <- read_csv("/path/to/my/custom/A23.globaltech_eff_with_random_changes.csv")
+
+    # Make some changes...
+    A322.subsector_shrwt <- A322.subsector_shrwt %>%
+      mutate(share.weight = as.double(share.weight),
+             year = as.integer(year))
+    A322.subsector_shrwt[1,"share.weight"] <- NEW.SHWT
+
+    # NOTE: we have to match the original object name we asked for in driver.DECLARE_MODIFY,
+    # which means including the file path for input files
+    # i.e. "energy/A322.subsector_shrwt" not "A322.subsector_shrwt"
+    # Other objects can be listed out just like for `return_data`
+    return_modified("energy/A322.subsector_shrwt" = A322.subsector_shrwt)
+
+  } else {
+    stop("Unknown command")
+  }
+}
+
+
+
Run usermod_fert once
+
+
To include our modification, we include this new chunk in our call to
+driver_drake() and also include a suffix to append to any
+affected objects (currently mandatory to include suffix).
+
Because we used the constant NEW.SHWT to assign the new
+value in our function, we first need to set it here.
+
+NEW.SHWT <- 0.5
+
+driver_drake(user_modifications = c("usermod_fert"),
+             xml_suffix = "__1") # output xml will be saved as ORIGINALNAME_001.xml
+
+
+
Run usermod_fert multiple times
+
+
We can also generate multiple modified xmls using
+driver_drake(). To do this, we simply need to change the
+value of NEW.SHWT and ensure that each different value is
+associated with a different xml_suffix. As well, we need to
+clear the usermod_fert object from drake’s cache using
+drake::clean() as drake is not aware of the change to
+NEW.SHWT. If you do not include this call, drake may assume
+that all downstream objects/xmls do not need to be updated.
+
+# Multiple shareweights to use
+shareweights <- seq(0.2, 1, 0.1)
+
+for (i in seq_along(shareweights)) {
+  drake::clean(list = "usermod_fert") # Ensures that drake knows to run usermod_fert
+
+  NEW.SHWT <- shareweights[i]
+
+  driver_drake(user_modifications = c("usermod_fert"),
+               xml_suffix = paste0("__", i))
+}
Zhao X, Chepeliev M, Patel P, Narayan K, Wise M (2023).
gcamfaostat: Prepare, process, and synthesize FAOSTAT data for global agroeconomic and multisector dynamic modeling.
-R package version 1.1.
+R package version 1.0.0.
@Manual{,
title = {gcamfaostat: Prepare, process, and synthesize FAOSTAT data for global agroeconomic and multisector dynamic modeling},
author = {Xin Zhao and Maksym Chepeliev and Pralit Patel and Kanishka Narayan and Marshall Wise},
year = {2023},
- note = {R package version 1.1},
+ note = {R package version 1.0.0},
}
Functions in this R package (gcamdata-faostat) download, clean, processe, connect, and visualize data from FAOSTAT for global economic and integrated assessment modeling. The package is built based on the existing gcamdata package structure for consistency, transparency, and traceability.
+
gcamfaostat is an R package to prepare, process, and synthesize FAOSTAT data for global agroeconomic and multisector dynamic modeling. The Food and Agriculture Organization Statistical Database (FAOSTAT) provides open-access data on country-level agricultural production, trade, food, nutrients, prices, land use, etc., serving as the most important data source for global agroeconomic and multisector dynamic models. gcamfaostat aims to shorten the distance between raw FAOSTAT data and economic modeling.
+
+
+
gcamfaostat and gcamdata
+
+
gcamfaostat is built on an existing R package, gcamdata, which has similar functions to gcamfaostat, though gcamdata covers a broader range of data inputs and is designed for the global multisector dynamic model GCAM. gcamfaostat uses the robust, reproducible, and transparent data processing system built into gcamdata. The two packages are consistent, while gcamfaostat focuses on agroeconomic data processing and can provide input data for gcamdata (and thus GCAM) and other models that rely on FAOSTAT data.
The goals of this version are: (1) Check FAOSTAT data updates and download necessary datasets (2) Develop a new method of primary equivalent aggregation to aggregate supply-utilization-accounting (SUA) data for items along the supply chain (e.g., wheat flour, bran, and germ to wheat-primary-equivalent). The method preserves balance across space (trade balance), time (storage carryover), supply-utilization, and the combination of these dimensions with minimal adjustments. (3) Apply the new method of primary equivalent aggregation to aggregating FAO ~500 SUA (SCL) items to ~100 primary equivalent items in FAO Food Balance Sheet (FBS). (4) Compare the balanced data compiled using different methods and visualize the difference.
Open the gcamfaostat.Rproj file in the gcamfaostat folder. RStudio should open the project.
+
To load the gcamfaostat package, enter:
+
devtools::load_all()
+
+
Run the driver
+
+
There are two ways to run the driver:
+
1. driver_drake()
+
driver_drake() runs the driver and stores the outputs in a hidden cache. When you run driver_drake() again it will skip steps that are up-to-date. This is useful if you will be adjusting the data inputs and code and running the data system multiple times. For this reason, we almost always recommend using driver_drake(). More details can be found in the vignette.
+
2. driver()
+
See the documentation for more options when running the driver, such as what outputs to generate or when to stop.
+
+
+
+
Output files
+
+
Users can specify the output directory (DIR_OUTPUT_CSV) for the output CSV files in constants.R. The default directory is outputs/CSV. The files are exported when OUTPUT_Export_CSV == TRUE (an option in constants.R).
+Users can also use the package functions to trace the processing step by step when driver_drake() is employed.
Copyright 2019 Battelle Memorial Institute; see the LICENSE file.
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index d7ccb08b8..602ecae72 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,13 +4,19 @@
https://realxinzhao.github.io/gcamfaostat/404.html
- https://realxinzhao.github.io/gcamfaostat/articles/gcamfaostat.html
+ https://realxinzhao.github.io/gcamfaostat/articles/getting-started.html
- https://realxinzhao.github.io/gcamfaostat/articles/getting-started/getting-started.html
+ https://realxinzhao.github.io/gcamfaostat/articles/index.html
- https://realxinzhao.github.io/gcamfaostat/articles/index.html
+ https://realxinzhao.github.io/gcamfaostat/articles/news/NEWS.html
+
+
+ https://realxinzhao.github.io/gcamfaostat/articles/news.html
+
+
+ https://realxinzhao.github.io/gcamfaostat/articles/usermod_vignette.html
https://realxinzhao.github.io/gcamfaostat/authors.html
diff --git a/vignettes/gcamfaostat.Rmd b/vignettes/gcamfaostat.Rmd
deleted file mode 100644
index 398cb672f..000000000
--- a/vignettes/gcamfaostat.Rmd
+++ /dev/null
@@ -1,23 +0,0 @@
----
-title: "Introduction to gcamfaostat"
-output: rmarkdown::html_vignette
-vignette: >
- %\VignetteIndexEntry{Introduction to gcamfaostat}
- %\VignetteEngine{knitr::rmarkdown}
- %\VignetteEncoding{UTF-8}
----
-
-```{r, include = FALSE}
-knitr::opts_chunk$set(
- collapse = TRUE,
- comment = "#>"
-)
-```
-
-
-
-
-
-```{r setup}
-library(gcamdata)
-```
diff --git a/vignettes/getting-started/getting-started.Rmd b/vignettes/getting-started.Rmd
similarity index 97%
rename from vignettes/getting-started/getting-started.Rmd
rename to vignettes/getting-started.Rmd
index 0e43bbfe1..51ad34769 100644
--- a/vignettes/getting-started/getting-started.Rmd
+++ b/vignettes/getting-started.Rmd
@@ -1,9 +1,9 @@
---
-title: "Getting Started with gcamdata"
+title: "Getting Started with gcamfaostat"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
- %\VignetteIndexEntry{Getting Started with gcamdata}
+ %\VignetteIndexEntry{Getting Started with gcamfaostat}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
diff --git a/vignettes/news.Rmd b/vignettes/news.Rmd
new file mode 100644
index 000000000..3959c4796
--- /dev/null
+++ b/vignettes/news.Rmd
@@ -0,0 +1,26 @@
+---
+title: "news"
+date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+ %\VignetteIndexEntry{news}
+ %\VignetteEngine{knitr::rmarkdown}
+ %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+ collapse = TRUE,
+ comment = "#>"
+)
+```
+
+```{r setup}
+library(gcamfaostat)
+```
+
+# gcamfaostat 1.0.0
+
+**First release TBD**
+
+The first release, gcamfaostat 1.0.0, includes the data generated for the Global Change Analysis Model ([GCAM v7.0](https://github.com/JGCRI/gcam-core/releases/tag/gcam-v7.0)). The source data downloaded from FAOSTAT is archived at a [Zenodo repository](https://zenodo.org/deposit/8260225).
+
diff --git a/vignettes/usermod_vignette.Rmd b/vignettes/usermod_vignette.Rmd
new file mode 100644
index 000000000..a474d0376
--- /dev/null
+++ b/vignettes/usermod_vignette.Rmd
@@ -0,0 +1,95 @@
+---
+title: "How to Write a User Modification Chunk"
+date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+ %\VignetteIndexEntry{How to Write a User Modification Chunk}
+ %\VignetteEngine{knitr::rmarkdown}
+ %\VignetteEncoding{UTF-8}
+---
+
+```{r setup, include = FALSE}
+knitr::opts_chunk$set(
+ collapse = TRUE,
+ comment = "#>"
+)
+library(devtools)
+devtools::load_all()
+```
+
+## Introduction
+Users may want to change the default gcamdata behavior by either modifying input assumptions or changing intermediate chunks. They can now write a "user modification" chunk that can be "plugged in" to the data system. This new chunk can modify any objects that are used or created in gcamdata and pass the modified object to all dependent chunks.
+
+User-modification chunks have a format similar to other data chunks in gcamdata, except that instead of producing a new output, they return a modified data object that replaces the original data object in the data system. These new chunks can be added to `driver_drake()` or `driver()` using the new arguments `user_modifications` and `xml_suffix`, which tell gcamdata which modification function to use and what suffix to add to all impacted downstream xmls.
+
+## Example: Modify Shareweight
+Below we show an example user-modification chunk to change a shareweight in an input csv file.
+
+### User Modification Chunk
+Here we load in two csv files, "energy/A322.subsector_shrwt.csv" and "common/GCAM_region_names.csv". We modify A322.subsector_shrwt, so we list it under `driver.DECLARE_MODIFY`, but do not modify GCAM_region_names, so it is listed under `driver.DECLARE_INPUTS`. Then, we set the shareweight column of the first row of A322.subsector_shrwt to `NEW.SHWT`. Finally, we use a new `return_modified()` function to return the modified A322.subsector_shrwt (note that we have to include the path for input files).
+
+
+``` {r}
+usermod_fert <- function(command, ...) {
+ if(command == driver.DECLARE_MODIFY) {
+ return(c(FILE = "energy/A322.subsector_shrwt"))
+ } else if(command == driver.DECLARE_INPUTS) {
+ # In addition to the objects users want to modify we can also ask for any other
+ # inputs we need to do our operations but won't be modified
+ return(c(FILE = "common/GCAM_region_names"))
+ } else if(command == driver.MAKE) {
+ all_data <- list(...)[[1]]
+ GCAM_region_names <- get_data(all_data, "common/GCAM_region_names")
+ A322.subsector_shrwt <- get_data(all_data, "energy/A322.subsector_shrwt")
+
+ # Users could also read in additional files that exist outside of the data system
+ # They should do that manually instead of through the driver.DECLARE_INPUTS so as to
+ # avoid mixing user's custom files with Core files
+ # A23.globaltech_eff.mine <- read_csv("/path/to/my/custom/A23.globaltech_eff_with_random_changes.csv")
+
+ # Make some changes...
+ A322.subsector_shrwt <- A322.subsector_shrwt %>%
+ mutate(share.weight = as.double(share.weight),
+ year = as.integer(year))
+ A322.subsector_shrwt[1,"share.weight"] <- NEW.SHWT
+
+ # NOTE: we have to match the original object name we asked for in driver.DECLARE_MODIFY,
+ # which means including the file path for input files
+ # i.e. "energy/A322.subsector_shrwt" not "A322.subsector_shrwt"
+ # Other objects can be listed out just like for `return_data`
+ return_modified("energy/A322.subsector_shrwt" = A322.subsector_shrwt)
+
+ } else {
+ stop("Unknown command")
+ }
+}
+```
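
The three-command structure of the chunk can be tried outside the data system. The sketch below is only an illustration under stated assumptions: `CMD_DECLARE_MODIFY`, `CMD_DECLARE_INPUTS`, `CMD_MAKE`, and `toy_usermod()` are hypothetical stand-ins, the input tables are made up, and a plain named list replaces `return_modified()`. It does not reproduce how the real driver dispatches commands.

```r
# Hypothetical stand-ins for the driver command constants (not the package's values)
CMD_DECLARE_MODIFY <- "DECLARE_MODIFY"
CMD_DECLARE_INPUTS <- "DECLARE_INPUTS"
CMD_MAKE <- "MAKE"

# Toy chunk with the same three-branch layout as usermod_fert
toy_usermod <- function(command, ...) {
  if (command == CMD_DECLARE_MODIFY) {
    # Objects this chunk will change
    return(c(FILE = "energy/A322.subsector_shrwt"))
  } else if (command == CMD_DECLARE_INPUTS) {
    # Objects this chunk only reads
    return(c(FILE = "common/GCAM_region_names"))
  } else if (command == CMD_MAKE) {
    all_data <- list(...)[[1]]
    shrwt <- all_data[["energy/A322.subsector_shrwt"]]
    shrwt[1, "share.weight"] <- 0.5
    # Stand-in for return_modified(): a list keyed by the full object name
    return(list("energy/A322.subsector_shrwt" = shrwt))
  } else {
    stop("Unknown command")
  }
}

# Made-up input tables, for illustration only
all_data <- list(
  "energy/A322.subsector_shrwt" = data.frame(subsector = c("a", "b"),
                                             share.weight = c(1, 1)),
  "common/GCAM_region_names" = data.frame(GCAM_region_ID = 1L, region = "USA")
)

# Simulate the three calls: declare what is modified, declare inputs, then build
to_modify <- toy_usermod(CMD_DECLARE_MODIFY)
inputs <- toy_usermod(CMD_DECLARE_INPUTS)
modified <- toy_usermod(CMD_MAKE, all_data)
```

Note that the modified table is returned under the same path-qualified name that was declared for modification, which mirrors the naming requirement described in the chunk's comments.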
+
+### Run usermod_fert once
+To include our modification, we include this new chunk in our call to `driver_drake()` and also include a suffix to append to any affected objects (currently mandatory to include suffix).
+
+Because we used the constant `NEW.SHWT` to assign the new value in our function, we first need to set it here.
+``` {r eval=FALSE}
+NEW.SHWT <- 0.5
+
+driver_drake(user_modifications = c("usermod_fert"),
+ xml_suffix = "__1") # output xml will be saved as ORIGINALNAME_001.xml
+```
+
+
+### Run usermod_fert multiple times
+We can also generate multiple modified xmls using `driver_drake()`. To do this, we simply need to change the value of `NEW.SHWT` and ensure that each different value is associated with a different `xml_suffix`. We also need to clear the usermod_fert object from drake's cache using `drake::clean()`, because drake is not aware of the change to `NEW.SHWT`. If you do not include this call, drake may assume that all downstream objects/xmls do not need to be updated.
+
+``` {r eval=FALSE}
+# Multiple shareweights to use
+shareweights <- seq(0.2, 1, 0.1)
+
+for (i in seq_along(shareweights)) {
+  drake::clean(list = "usermod_fert") # Ensures that drake knows to run usermod_fert
+
+ NEW.SHWT <- shareweights[i]
+
+ driver_drake(user_modifications = c("usermod_fert"),
+ xml_suffix = paste0("__", i))
+}
+```
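
Because each run is identified only by its suffix, it may help to record which shareweight goes with which `xml_suffix`. The following is a minimal sketch in base R; `run_table` and the commented-out file name are hypothetical and not part of the package.

```r
# Same shareweights as in the loop above
shareweights <- seq(0.2, 1, 0.1)

# Pair each suffix with the shareweight used for that run
run_table <- data.frame(
  xml_suffix = paste0("__", seq_along(shareweights)),
  share_weight = shareweights,
  stringsAsFactors = FALSE
)

print(run_table)

# Optionally keep the lookup next to the generated xmls (hypothetical file name)
# write.csv(run_table, "usermod_fert_runs.csv", row.names = FALSE)
```

Building the suffixes with `seq_along()` keeps them unique, which is the property the loop relies on to avoid overwriting earlier outputs.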