docs
sprivite committed Nov 14, 2024
1 parent b85ab50 commit f10be20
Showing 3 changed files with 6 additions and 6 deletions.
2 changes: 1 addition & 1 deletion docs/index.md
@@ -9,7 +9,7 @@ Implementing observational studies using real-world data (RWD) is challenging, r
PhenEx (Automated Phenotype Extraction) fills this gap. PhenEx is a Python-based software package that provides reusable and end-to-end tested implementations of commonly performed operations in the implementation of observational studies. The main advantages of PhenEx are:

- **Arbitrarily complex medical definitions**: Build medical definitions that depend on diagnoses, labs, procedures, and encounter context, as well as on other medical definitions
-- **Data-model agnostic**: Work with almost any RWD dataset with only extremely minimal mappings required. Only map the data needed for the study execution.
+- **Data-model agnostic**: Work with almost any RWD dataset with only extremely minimal mappings. Only map the data needed for the study execution. Use the ontologies native to your dataset.
- **Portable**: Built on top of [ibis](https://ibis-project.org/), PhenEx works with any backend that ibis supports, including snowflake, PySpark and many more!
- **Intuitive interface**: Study specification in PhenEx mirrors plain language description of the study.
- **High test coverage**: Full confidence answer is correct.
2 changes: 1 addition & 1 deletion docs/installation.md
@@ -16,7 +16,7 @@ Coming soon!
To install from source, run the following from within your virtual environment:

```
-git clone git@github.com:Bayer-Group/PhenEx.git && \
+git clone https://github.com/Bayer-Group/PhenEx.git && \
cd PhenEx && \
pip install -r requirements.txt && \
pip install .
8 changes: 4 additions & 4 deletions phenex/sim.py
@@ -5,7 +5,7 @@
from dataclasses import asdict


-def generate_fake_data(n_patients: int, domains: DomainsDictionary) -> Dict[str, pd.DataFrame]:
+def generate_mock_mapped_tables(n_patients: int, domains: DomainsDictionary) -> Dict[str, pd.DataFrame]:
"""
Generate fake data for N patients based on the given domains.
@@ -24,11 +24,9 @@ def generate_fake_data(n_patients: int, domains: DomainsDictionary) -> Dict[str,
if "DATE" in col:
start_date = pd.to_datetime('2000-01-01')
end_date = pd.to_datetime('2020-12-31')
-data[col] = pd.to_datetime(np.random.randint(start_date.value, end_date.value, n_patients))
+data[col] = pd.to_datetime(np.random.randint(start_date.value, end_date.value, n_patients)).date
elif "ID" in col:
data[col] = np.arange(1, n_patients + 1)
-elif "CODE" in col:
-data[col] = np.random.choice(['A', 'B', 'C', 'D'], n_patients)
elif "VALUE" in col:
data[col] = np.random.uniform(0, 100, n_patients)
elif "CODE_TYPE" in col:
@@ -40,6 +38,8 @@ def generate_fake_data(n_patients: int, domains: DomainsDictionary) -> Dict[str,
data[col] = np.random.choice(['CPT', 'HCPCS'], n_patients)
else:
data[col] = np.random.choice(['TYPE1', 'TYPE2'], n_patients)
+elif "CODE" in col:
+data[col] = np.random.choice(['A', 'B', 'C', 'D', 'E', 'F', 'G'], n_patients)
else:
data[col] = np.random.choice(range(1000), n_patients)
fake_data[domain] = pd.DataFrame(data)
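
Note on the phenex/sim.py hunks above: the substring test `"CODE" in col` is also true for a column named `CODE_TYPE`, so with the old branch order the `CODE_TYPE` branch could not be reached; moving the `CODE` check below it presumably addresses that. The added `.date` makes the generated column hold plain `datetime.date` objects rather than timestamps. The sketch below illustrates both behaviors under stated assumptions: `mock_column` is a hypothetical helper (not part of PhenEx), the column names are illustrative, the `CODE_TYPE` branch is reduced to the `TYPE1`/`TYPE2` fallback visible in the diff, and a NumPy `Generator` is swapped in for the module-level `np.random` calls so the example can be seeded.

```python
import numpy as np
import pandas as pd


def mock_column(col: str, n: int, rng: np.random.Generator):
    """Hypothetical helper mirroring the branch order after this commit."""
    if "DATE" in col:
        start = pd.to_datetime("2000-01-01")
        end = pd.to_datetime("2020-12-31")
        # Draw random nanosecond timestamps between the two bounds.
        stamps = rng.integers(start.value, end.value, n)
        # .date converts the DatetimeIndex to an array of datetime.date objects.
        return pd.to_datetime(stamps).date
    elif "ID" in col:
        return np.arange(1, n + 1)
    elif "VALUE" in col:
        return rng.uniform(0, 100, n)
    elif "CODE_TYPE" in col:
        # Must be tested before "CODE": "CODE" is a substring of "CODE_TYPE".
        return rng.choice(["TYPE1", "TYPE2"], n)
    elif "CODE" in col:
        return rng.choice(["A", "B", "C", "D", "E", "F", "G"], n)
    return rng.choice(range(1000), n)


rng = np.random.default_rng(0)
for name in ["EVENT_DATE", "PERSON_ID", "VALUE", "CODE_TYPE", "CODE"]:
    values = mock_column(name, 3, rng)
    print(name, type(values[0]).__name__, list(values))
```

Swapping the `CODE_TYPE` and `CODE` branches in this sketch reproduces the old behavior, in which `CODE_TYPE` columns receive letter codes.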