diff --git a/.Rbuildignore b/.Rbuildignore new file mode 100644 index 0000000..91114bf --- /dev/null +++ b/.Rbuildignore @@ -0,0 +1,2 @@ +^.*\.Rproj$ +^\.Rproj\.user$ diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..5b6a065 --- /dev/null +++ b/.gitignore @@ -0,0 +1,4 @@ +.Rproj.user +.Rhistory +.RData +.Ruserdata diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..5ceaa70 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,38 @@ +# Welcome! +Thank you for contributing to CDC's Open Source projects! If you have any +questions or doubts, don't be afraid to send them our way. We appreciate all +contributions, and we are looking forward to fostering an open, transparent, and +collaborative environment. + +Before contributing, we encourage you to also read our [LICENSE](https://github.com/CDCgov/template/blob/master/LICENSE), +[README](https://github.com/CDCgov/template/blob/master/README.md), and +[code-of-conduct](https://github.com/CDCgov/template/blob/master/code-of-conduct.md) +files, also found in this repository. If you have any inquiries or questions not +answered by the content of this repository, feel free to [contact us](mailto:surveillanceplatform@cdc.gov). + +## Public Domain +This project is in the public domain within the United States, and copyright and +related rights in the work worldwide are waived through the [CC0 1.0 Universal public domain dedication](https://creativecommons.org/publicdomain/zero/1.0/). +All contributions to this project will be released under the CC0 dedication. By +submitting a pull request you are agreeing to comply with this waiver of +copyright interest. + +## Requesting Changes +Our pull request/merging process is designed to give the CDC Surveillance Team +and others in our space an opportunity to consider and discuss any suggested +changes. This policy affects all CDC spaces, both on-line and off, and all users +are expected to abide by it.
+ +### Open an issue in the repository +If you don't have specific language to submit but would like to suggest a change +or have something addressed, you can open an issue in this repository. Team +members will respond to the issue as soon as possible. + +### Submit a pull request +If you would like to contribute, please submit a pull request. In order for us +to merge a pull request, it must: + * Be at least seven days old. Pull requests may be held longer if necessary + to give people the opportunity to assess them. + * Receive a +1 from a majority of team members associated with the request. + If there is significant dissent within the team, a meeting will be held to + discuss a plan of action for the pull request. diff --git a/DESCRIPTION b/DESCRIPTION new file mode 100644 index 0000000..8eec2db --- /dev/null +++ b/DESCRIPTION @@ -0,0 +1,15 @@ +Package: SeqSpawnR +Type: Package +Title: Spawn Random DNA Sequences +Version: 0.1.0 +Author: Tony Boyles +Maintainer: Tony Boyles +Description: This package simplifies the process of creating randomly mutated DNA sequences. +License: file LICENSE +Encoding: UTF-8 +LazyData: true +Depends: R (>= 2.10.0) +Imports: + stringr, + magrittr +RoxygenNote: 6.1.1 diff --git a/DISCLAIMER.md b/DISCLAIMER.md new file mode 100644 index 0000000..63fa40c --- /dev/null +++ b/DISCLAIMER.md @@ -0,0 +1,23 @@ +# DISCLAIMER +Use of this service is limited only to **non-sensitive and publicly available +data**. Users must not use, share, or store any kind of sensitive data like +health status, provision or payment of healthcare, Personally Identifiable +Information (PII) and/or Protected Health Information (PHI), etc. under **ANY** +circumstance. + +Administrators for this service reserve the right to moderate all information +used, shared, or stored with this service at any time. Any user that cannot +abide by this disclaimer and Code of Conduct may be subject to action, up to +and including revoking access to services.
+ +The material embodied in this software is provided to you "as-is" and without +warranty of any kind, express, implied or otherwise, including without +limitation, any warranty of fitness for a particular purpose. In no event shall +the Centers for Disease Control and Prevention (CDC) or the United States (U.S.) +government be liable to you or anyone else for any direct, special, incidental, +indirect or consequential damages of any kind, or any damages whatsoever, +including without limitation, loss of profit, loss of use, savings or revenue, +or the claims of third parties, whether or not CDC or the U.S. government has +been advised of the possibility of such loss, however caused and on any theory +of liability, arising out of or in connection with the possession, use or +performance of this software. diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..8dada3e --- /dev/null +++ b/LICENSE @@ -0,0 +1,201 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/NAMESPACE b/NAMESPACE new file mode 100644 index 0000000..863c4b8 --- /dev/null +++ b/NAMESPACE @@ -0,0 +1,4 @@ +importFrom(magrittr,"%>%") +importFrom(stringr,"str_replace") + +exportPattern("^[[:alpha:]]+") diff --git a/R/SequenceSpawner.R b/R/SequenceSpawner.R new file mode 100644 index 0000000..e735d7c --- /dev/null +++ b/R/SequenceSpawner.R @@ -0,0 +1,102 @@ +#' spawn_sequences +#' +#' @param n The number of sequences to output +#' @param snps The maximum number of snps in each mutant sequence +#' @param seed The original DNA Sequence from which the outputs will be mutated +#' +#' @return A vector of mutant sequences +#' @export +spawn_sequences <- function(n = 20000, snps = 100, seed){ + if(missing(seed)){ + seed <- SeqSpawnR::HXB2 + } + seednChar <- nchar(seed) + + SampleCodons <- c("GCA", "GCC", "GCG", "GCT", "AAC", "AAT", "GAC", "GAT", "TGC", "TGT", "GAC", "GAT", "GAA", "GAG", "TTC", "TTT", "GGA", "GGC", "GGG", "GGT", "CAC", "CAT", "ATA", "ATC", "ATT" ,"AAA", "AAG" ,"CTA", "CTC", "CTG", "CTT", "TTA", "TTG" , "ATG", "AAC", "AAT", "CCA", "CCC", "CCG", "CCT", "CAA", "CAG", "AGA", "AGG", "CGA", "CGC", "CGG", "CGT", "AGC", "AGT", "TCA", "TCC", "TCG", "TCT", "ACA", "ACC", "ACG", "ACT", "GTA", "GTC", "GTG", "GTT", "TGG", "TAC", "TAT", "CAA", "CAG", "GAA", "GAG" ) + + SampleSNPs <- c("A", "C", "G", "T") + + # Set up the vector for the total number of seed variations + 
sequences = vector(mode = "character", n) + + #Set up vectors to record the history of the Codon string to search for in the seed sequence, and the replacement + # codon string + RandomCodonSetHistory = vector(mode = "character", n) + ReplacementCodonSetHistory = vector(mode = "character", n) + + #Initialize the seed vector with the initial seed sequence as a seed. + sequences[1] <- seed + + x <- 1 + + while(x < n){ + # The second value can be varied for the total number of possible codons that you want to be considered for consecutive + # set of codons for variation + NCodons <- sample(1:10, 1) + + # Of the entire set of codons that represent amino acids randomly select up to NCodons check for existence + # of the codon string in the current seed under consideration. + RandomCodonSet <- paste(sample(SampleCodons, NCodons, replace=TRUE), collapse="") + RandomCodonSetHistory[x] <- RandomCodonSet + + # Do the same and select a codon set of the same length that will replace the searched set. + ReplacementCodonSet <- paste(sample(SampleCodons, NCodons, replace=TRUE), collapse="") + ReplacementCodonSetHistory[x] <- ReplacementCodonSet + + # If there is a match between the RandomCodonSet and the seed in process, set the control flags, + # and then replace the codon set, update the vector, and increment the count. + if(RandomCodonSet %>% grep(sequences[x], fixed = TRUE) %>% length > 0){ + Newseed <- str_replace(sequences[sample(1:x, 1)], RandomCodonSet, ReplacementCodonSet) + sequences[x+1] <- Newseed + + # Add more SNP substitutions randomly across the entire sequence to newly created variant. Replacement numbers are controlled by randomly + # sampling the AddedSNP number, and randomly picking SNPs to replace. + for (j in 1:(sample(1:snps, 1))){ + RandomSNP <- sample(SampleSNPs, 1); + LocOfSNP <- sample(1:seednChar, 1); + substr(sequences[x+1], LocOfSNP, LocOfSNP) = RandomSNP; # Location of selected SNP to be replaced.
+ } + x <- x+1 + } + } + return(sequences) +} + +#' Pol Region of the HXB2 Strain of HIV +#' +#' @format A string with 1300 Characters +"HXB2" +#> [1] "HXB2" + +#' write_fasta +#' +#' @param sequences A Vector of Sequences +#' @param filename A Filename +#' +#' @export +write_fasta <- function(sequences, filename){ + if(missing(filename)){ + filename <- Sys.time() %>% + gsub(" ", "-", .) %>% + gsub(":","-", .) %>% + paste(".fasta") %>% + gsub(" ","",.) + } + + time <- Sys.time() + + n <- length(sequences) + + for(i in 1:n){ + time %>% + gsub(":", "_", .) %>% + gsub(" ", "_", .) %>% + paste(">", .) %>% + gsub(" ", "", .) %>% + paste(i) %>% + gsub(" ", "-", .) %>% + paste(sequences[i]) %>% + gsub(" ", "\n", .) %>% + write(file = filename, append = TRUE) + } +} diff --git a/README.md b/README.md new file mode 100644 index 0000000..f7ccec5 --- /dev/null +++ b/README.md @@ -0,0 +1,91 @@ +# SeqSpawnR + +Spawn Random DNA Sequences + +*This repository was created for use by CDC programs to collaborate on public health surveillance related projects in support of the CDC Surveillance Strategy. Github is not hosted by the CDC, but is used by CDC and its partners to share information and collaborate on software.* + +## Install + +```R +devtools::install_github('CDCgov/SeqSpawnR') +``` + +## Use + +```R +HIV_Samples <- SeqSpawnR::spawn_sequences(10) +``` + +`HIV_Samples` will be a vector of mutated HIV sequences. By default, it uses the *pol* region of the HXB2 strain of HIV as the DNA seed. To set the seed to a different DNA Sequence: + +```R +mySeed <- 'GATTACA' +HIV_Samples <- SeqSpawnR::spawn_sequences(10, seed = mySeed) +``` + +Where `mySeed` is a string containing the DNA Sequence you wish to use as your seed. + +If you'd like to write the samples into a fasta file... 
+ +```R +filename <- 'HIV_Samples.fasta' +SeqSpawnR::write_fasta(HIV_Samples, filename) +``` + +## Public Domain +This repository constitutes a work of the United States Government and is not +subject to domestic copyright protection under 17 USC § 105. This repository is in +the public domain within the United States, and copyright and related rights in +the work worldwide are waived through the [CC0 1.0 Universal public domain dedication](https://creativecommons.org/publicdomain/zero/1.0/). +All contributions to this repository will be released under the CC0 dedication. By +submitting a pull request you are agreeing to comply with this waiver of +copyright interest. + +## License +The repository utilizes code licensed under the terms of the Apache Software +License and therefore is licensed under ASL v2 or later. + +The source code in this repository is free: you can redistribute it and/or modify it under +the terms of the Apache Software License version 2, or (at your option) any +later version. + +The source code in this repository is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A +PARTICULAR PURPOSE. See the Apache Software License for more details. + +You should have received a copy of the Apache Software License along with this +program. If not, see http://www.apache.org/licenses/LICENSE-2.0.html + +The source code forked from other open source projects will inherit its license. + + +## Privacy +This repository contains only non-sensitive, publicly available data and +information. All material and community participation is covered by the +Surveillance Platform [Disclaimer](https://github.com/CDCgov/template/blob/master/DISCLAIMER.md) +and [Code of Conduct](https://github.com/CDCgov/template/blob/master/code-of-conduct.md). +For more information about CDC's privacy policy, please visit [http://www.cdc.gov/privacy.html](http://www.cdc.gov/privacy.html).
+ +## Contributing +Anyone is encouraged to contribute to the repository by [forking](https://help.github.com/articles/fork-a-repo) +and submitting a pull request. (If you are new to GitHub, you might start with a +[basic tutorial](https://help.github.com/articles/set-up-git).) By contributing +to this project, you grant a world-wide, royalty-free, perpetual, irrevocable, +non-exclusive, transferable license to all users under the terms of the +[Apache Software License v2](http://www.apache.org/licenses/LICENSE-2.0.html) or +later. + +All comments, messages, pull requests, and other submissions received through +CDC including this GitHub page are subject to the [Presidential Records Act](http://www.archives.gov/about/laws/presidential-records.html) +and may be archived. Learn more at [http://www.cdc.gov/other/privacy.html](http://www.cdc.gov/other/privacy.html). + +## Records +This repository is not a source of government records, but is a copy to increase +collaboration and collaborative potential. All government records will be +published through the [CDC web site](http://www.cdc.gov). + +## Notices +Please refer to [CDC's Template Repository](https://github.com/CDCgov/template) +for more information about [contributing to this repository](https://github.com/CDCgov/template/blob/master/CONTRIBUTING.md), +[public domain notices and disclaimers](https://github.com/CDCgov/template/blob/master/DISCLAIMER.md), +and [code of conduct](https://github.com/CDCgov/template/blob/master/code-of-conduct.md). 
diff --git a/SeqSpawnR.Rproj b/SeqSpawnR.Rproj new file mode 100644 index 0000000..f0d6187 --- /dev/null +++ b/SeqSpawnR.Rproj @@ -0,0 +1,21 @@ +Version: 1.0 + +RestoreWorkspace: Default +SaveWorkspace: Default +AlwaysSaveHistory: Default + +EnableCodeIndexing: Yes +UseSpacesForTab: Yes +NumSpacesForTab: 2 +Encoding: UTF-8 + +RnwWeave: Sweave +LaTeX: pdfLaTeX + +AutoAppendNewline: Yes +StripTrailingWhitespace: Yes + +BuildType: Package +PackageUseDevtools: Yes +PackageInstallArgs: --no-multiarch --with-keep.source +PackageRoxygenize: rd,collate,namespace,vignette diff --git a/code-of-conduct.md b/code-of-conduct.md new file mode 100644 index 0000000..f657aae --- /dev/null +++ b/code-of-conduct.md @@ -0,0 +1,103 @@ +# Creating a Culture of Innovation +We aspire to create a culture where people work joyfully, communicate openly +about things that matter, and provide great services globally. We would like our +team and communities (both government and private sector) to reflect on +diversity of all kinds, not just the classes protected in law. Diversity fosters +innovation. Diverse teams are creative teams. We need a diversity of perspective +to create solutions for the challenges we face. + +This is our code of conduct (adapted from [18F's Code of Conduct](https://github.com/18F/code-of-conduct)). +We follow all Equal Employment Opportunity laws and we expect everyone we work +with to adhere to the [GSA Anti-harassment Policy](http://www.gsa.gov/portal/directive/d0/content/512516), +even if they do not work for the Centers for Disease Control and Prevention or +GSA. We expect every user to follow this code of conduct and the laws and +policies mentioned above. + +## Be Empowering +Consider what you can do to encourage and support others. Make room for quieter +voices to contribute. Offer support and enthusiasm for great ideas. Leverage the +low cost of experimentation to support your colleagues' ideas, and take care to +acknowledge the original source.
Look for ways to contribute and collaborate, +even in situations where you normally wouldn't. Share your knowledge and skills. +Prioritize access for and input from those who are traditionally excluded from +the civic process. + +## Rules of Behavior + * I understand that I must complete security awareness and records management + training annually in order to comply with the latest security and records + management policies. + * I understand that I must also follow the [Rules of Behavior for use of HHS Information Resources](http://www.hhs.gov/ocio/policy/hhs-rob.html) + * I understand that I must not use, share, or store any kind of sensitive data + (health status, provision or payment of healthcare, PII, etc.) under ANY + circumstance. + * I will not knowingly conceal, falsify, or remove information. + * I understand that I can only use non-sensitive and/or publicly available + data. + * I understand that all passwords I create to set up accounts need to comply + with CDC's password policy. + * I understand that the stewards reserve the right to moderate all data at any + time. + +## Boundaries +Create boundaries to your own behavior and consider how you can create a safe +space that helps prevent unacceptable behavior by others. We can't list all +instances of unacceptable behavior, but we can provide examples to help guide +our community in thinking through how to respond when we experience these types +of behavior, whether directed at ourselves or others. + +If you are unsure if something is appropriate behavior, it probably is not. Each +person we interact with can define where the line is for them. Impact matters +more than intent. Ensuring that your behavior does not have a negative impact is +your responsibility. Problems usually arise when we assume that our way of +thinking or behavior is the norm for everyone.
+ +### Here are some examples of unacceptable behavior + * Negative or offensive remarks based on the protected classes as listed in the + GSA Anti-harassment Policy of race, religion, color, sex, national origin, + age, disability, genetic information, sexual orientation, gender identity, + parental status, marital status, and political affiliation as well as gender + expression, mental illness, socioeconomic status or backgrounds, + neuro(a)typicality, physical appearance, body size, or clothing. Consider + that calling attention to differences can feel alienating. + * Sustained disruption of meetings, talks, or discussions, including chatrooms. + * Patronizing language or behavior. + * Aggressive behavior, such as unconstructive criticism, providing corrections + that do not improve the conversation (sometimes referred to as "well + actually's"), repeatedly interrupting or talking over someone else, feigning + surprise at someone's lack of knowledge or awareness about a topic, or subtle + prejudice. + * Referring to people in a way that misidentifies their gender and/or rejects + the validity of their gender identity; for instance by using incorrect + pronouns or forms of address (misgendering). + * Retaliating against anyone who files a formal complaint that someone has + violated these codes or laws. + +## Background +CDC Scientific Clearance is the process of obtaining approvals by appropriate +CDC officials before a CDC information product is released to the public or +CDC's external public health partners. Information products that require formal +clearance include print, electronic, or oral materials, that CDC employees +author or co-author, whether published by CDC or outside CDC. CDC contractors +developing content on behalf of CDC for the public or CDC's external public +health partners are also required to put their content through the formal +clearance process.
The collaborative functions related to the projects include +blogs, wikis, forums, bug tracking sites, source control and +others deemed as necessary. + +For those individuals within the CDC, adherence to the following policies is +required: +* CDC ["Clearance of Information Products Disseminated Outside CDC for Public Use"](http://www.cdc.gov/maso/Policy/PublicUse.pdf) +* HHS ["Ensuring the Quality of Information Disseminated by HHS agencies"](http://aspe.hhs.gov/infoquality) + +All collaborative materials will be controlled by the rules contained within +this document. This will allow for the real-time collaboration opportunities +among CDC employees, CDC contractors and CDC public health partners. + +## Credit +This code of conduct was mainly adapted from [18F's Code of Conduct](https://github.com/18F/code-of-conduct) +and the [CDC's Informatics Innovation Unit R&D Lab's code of conduct.](http://www.phiresearchlab.org/?page_id=1715) + +## Relevant Legal Considerations +* [Laws enforced by the Equal Employment Opportunity Commission](http://www.eeoc.gov/laws/statutes/index.cfm) +* [Types of discrimination prohibited by law](http://www.eeoc.gov/laws/types) +* [New and proposed regulations](http://www.eeoc.gov/laws/regulations/index.cfm) diff --git a/data/HXB2.rda b/data/HXB2.rda new file mode 100644 index 0000000..da3fa51 Binary files /dev/null and b/data/HXB2.rda differ diff --git a/man/HXB2.Rd b/man/HXB2.Rd new file mode 100644 index 0000000..87edb6d --- /dev/null +++ b/man/HXB2.Rd @@ -0,0 +1,14 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/SequenceSpawner.R +\docType{data} +\name{HXB2} +\alias{HXB2} +\title{Pol Region of the HXB2 Strain of HIV} +\format{A string with 1300 Characters} +\usage{ +HXB2 +} +\description{ +Pol Region of the HXB2 Strain of HIV +} +\keyword{datasets} diff --git a/man/spawn_sequences.Rd b/man/spawn_sequences.Rd new file mode 100644 index 0000000..086417c --- /dev/null +++ b/man/spawn_sequences.Rd @@
-0,0 +1,21 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/SequenceSpawner.R +\name{spawn_sequences} +\alias{spawn_sequences} +\title{spawn_sequences} +\usage{ +spawn_sequences(n = 20000, snps = 100, seed) +} +\arguments{ +\item{n}{The number of sequences to output} + +\item{snps}{The maximum number of snps in each mutant sequence} + +\item{seed}{The original DNA Sequence from which the outputs will be mutated} +} +\value{ +A vector of mutant sequences +} +\description{ +spawn_sequences +} diff --git a/man/write_fasta.Rd b/man/write_fasta.Rd new file mode 100644 index 0000000..9af9d89 --- /dev/null +++ b/man/write_fasta.Rd @@ -0,0 +1,16 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/SequenceSpawner.R +\name{write_fasta} +\alias{write_fasta} +\title{write_fasta} +\usage{ +write_fasta(sequences, filename) +} +\arguments{ +\item{sequences}{A Vector of Sequences} + +\item{filename}{A Filename} +} +\description{ +write_fasta +}
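Review note on `write_fasta`: the function in `R/SequenceSpawner.R` appends one record per loop iteration, each made of a `>` header line followed by the sequence. The sketch below shows the same two-line record layout written in one vectorized base-R call. It is only an illustration under stated assumptions: `write_fasta_sketch` and its `prefix` argument are hypothetical names that are not part of SeqSpawnR, the headers are not timestamped, and the target file is overwritten instead of appended to.

```r
# Hypothetical helper, not part of SeqSpawnR: writes each sequence as a
# two-line FASTA record (">header" line, then the sequence line).
write_fasta_sketch <- function(sequences, filename, prefix = "seq") {
  # One header per sequence, numbered in input order: ">seq-1", ">seq-2", ...
  headers <- paste0(">", prefix, "-", seq_along(sequences))
  # rbind() stacks headers over sequences; as.vector() reads the matrix
  # column by column, which interleaves them as h1, s1, h2, s2, ...
  records <- as.vector(rbind(headers, sequences))
  # writeLines() overwrites the file (the package function appends instead)
  writeLines(records, con = filename)
  invisible(filename)
}

# Usage with made-up sequences written to a temporary file
out <- tempfile(fileext = ".fasta")
write_fasta_sketch(c("GATTACA", "GATTACC"), out)
readLines(out)
```

Writing all records in one call avoids reopening the file for every sequence, which may matter at the default `n = 20000`; whether append semantics are needed is a design choice left to the package author.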