Merge pull request #26 from WenjieDu/dev
Merge dev into main
WenjieDu authored Dec 6, 2022
2 parents 6927c5d + a7c9c12 commit f947a5e
Showing 3 changed files with 7 additions and 19 deletions.
8 changes: 6 additions & 2 deletions README.md
@@ -22,7 +22,7 @@
</a>
<!-- Coveralls report -->
<a alt='Coveralls report' href='https://coveralls.io/github/WenjieDu/PyPOTS'>
-<img src='https://coveralls.io/repos/github/WenjieDu/PyPOTS/badge.svg'>
+<img src='https://img.shields.io/coverallsCoverage/github/WenjieDu/PyPOTS?branch=main&logo=coveralls&labelColor=3F5767'>
</a>
<!-- PyPI download number -->
<a alt='PyPI download number' href='https://pepy.tech/project/pypots'>
@@ -36,6 +36,10 @@
<a alt='CODE_OF_CONDUCT' href='CODE_OF_CONDUCT.md'>
<img src='https://img.shields.io/badge/Contributor%20Covenant-v2.1-4baaaa.svg'>
</a>
+<!-- Slack Workspace -->
+<a alt='Slack Workspace' href='https://join.slack.com/t/pypots-dev/shared_invite/zt-1gq6ufwsi-p0OZdW~e9UW_IA4_f1OfxA'>
+<img src='https://img.shields.io/badge/Slack-PyPOTS-grey?logo=slack&labelColor=4A154B&color=62BCE5'>
+</a>
</p>

⦿ `Motivation`: Due to all kinds of reasons like failure of collection sensors, communication error, and unexpected malfunction, missing values are common to see in time series from the real-world environment. This makes partially-observed time series (POTS) a pervasive problem in open-world modeling and prevents advanced data analysis. Although this problem is important, the area of data mining on POTS still lacks a dedicated toolkit. PyPOTS is created to fill in this blank.
@@ -85,7 +89,7 @@ or
## ❖ Attention 👀
The documentation and tutorials are under construction. And a short paper introducing PyPOTS is on the way! 🚀 Stay tuned please!

-‼️ PyPOTS is currently under developing. If you like it and look forward to its growth, <ins>please give PyPOTS a star and watch it to keep you posted on its progress and to let me know that its development is meaningful</ins>. If you have any feedback, or want to contribute ideas/suggestions or share time-series related algorithms/papers, please join PyPOTS community and <a alt='GitHub Discussions' href='https://github.com/WenjieDu/PyPOTS/discussions'><img align='center' src='https://img.shields.io/badge/Chat-in_Discussions-green?logo=github&color=60A98D'></a>, or create an issue.
+‼️ PyPOTS is currently under developing. If you like it and look forward to its growth, <ins>please give PyPOTS a star and watch it to keep you posted on its progress and to let me know that its development is meaningful</ins>. If you have any feedback, or want to contribute ideas/suggestions or share time-series related algorithms/papers, please join PyPOTS community and chat on <a alt='Slack Workspace' href='https://join.slack.com/t/pypots-dev/shared_invite/zt-1gq6ufwsi-p0OZdW~e9UW_IA4_f1OfxA'><img align='center' src='https://img.shields.io/badge/Slack-PyPOTS-grey?logo=slack&labelColor=4A154B&color=62BCE5'></a>, or create an issue. If you have any additional questions or have interests in collaboration, please take a look at [my GitHub profile](https://github.com/WenjieDu) and feel free to contact me 😃.

Thank you all for your attention! 😃

2 changes: 1 addition & 1 deletion pypots/data/load_specific_datasets.py
@@ -43,7 +43,7 @@ def preprocess_physionet2012(data):
     def apply_func(df_temp): # pad and truncate to set the max length of samples as 48
         missing = list(set(range(0, 48)).difference(set(df_temp["Time"])))
         missing_part = pd.DataFrame({"Time": missing})
-        df_temp = df_temp.append(missing_part, ignore_index=False, sort=False) # pad
+        df_temp = pd.concat([df_temp, missing_part], ignore_index=False, sort=False) # pad
         df_temp = df_temp.set_index("Time").sort_index().reset_index()
         df_temp = df_temp.iloc[:48] # truncate
         return df_temp
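
The swap from `DataFrame.append` to `pd.concat` matters because `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. A minimal standalone sketch of the pad-and-truncate step follows. The `pad_and_truncate` name, the `max_len` parameter, and the toy `HR` column are illustrative assumptions, not part of the commit; only the `pd.concat` call mirrors the changed line.

```python
import pandas as pd


def pad_and_truncate(df_temp, max_len=48):
    """Pad a single record so that every time step in [0, max_len) has a row,
    then truncate to at most max_len rows. Hypothetical helper for illustration."""
    # Time steps that have no observation in this record
    missing = sorted(set(range(max_len)).difference(df_temp["Time"]))
    missing_part = pd.DataFrame({"Time": missing})
    # pd.concat replaces the removed DataFrame.append; padded rows get NaN features
    df_temp = pd.concat([df_temp, missing_part], ignore_index=False, sort=False)
    # Order rows by time step and restore "Time" as a regular column
    df_temp = df_temp.set_index("Time").sort_index().reset_index()
    return df_temp.iloc[:max_len]


# Toy record with observations at time steps 0, 2, and 5 only
sample = pd.DataFrame({"Time": [0, 2, 5], "HR": [80.0, 82.0, 79.0]})
padded = pad_and_truncate(sample, max_len=6)
```

Here `padded` has one row per time step from 0 to 5, with `NaN` in `HR` for the three padded steps.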
16 changes: 0 additions & 16 deletions pypots/tests/unified_data_for_test.py
@@ -5,7 +5,6 @@
 # Created by Wenjie Du <[email protected]>
 # License: GLP-v3
 
-import pandas as pd
 import torch
 from sklearn.model_selection import train_test_split
 from sklearn.preprocessing import StandardScaler
@@ -75,21 +74,6 @@ def gene_physionet2012():
     # generate samples
     df = load_specific_dataset("physionet_2012")
     X = df["X"]
-    X = X.drop(df["static_features"], axis=1)
-
-    def apply_func(df_temp):
-        missing = list(set(range(0, 48)).difference(set(df_temp["Time"])))
-        missing_part = pd.DataFrame({"Time": missing})
-        df_temp = df_temp.append(missing_part, ignore_index=False, sort=False)
-        df_temp = df_temp.set_index("Time").sort_index().reset_index()
-        df_temp = df_temp.iloc[:48]
-        return df_temp
-
-    X = X.groupby("RecordID").apply(apply_func)
-    X = X.drop("RecordID", axis=1)
-    X = X.reset_index()
-    X = X.drop(["level_1", "Time"], axis=1)
-
     y = df["y"]
     all_recordID = X["RecordID"].unique()
     train_set_ids, test_set_ids = train_test_split(all_recordID, test_size=0.2)
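
The remaining lines split on unique `RecordID` values rather than on individual rows, which keeps every time step of a record in the same subset. A minimal sketch of that idea on toy data follows. The `HR` column, the ID values, the variable names, and the fixed `random_state` are assumptions for illustration and do not come from the commit.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy long-format data: 10 hypothetical records with 3 time steps each
X = pd.DataFrame(
    {
        "RecordID": [rid for rid in range(100, 110) for _ in range(3)],
        "HR": [float(i) for i in range(30)],
    }
)

# Split the record IDs, not the rows, so no record straddles both subsets
all_record_ids = X["RecordID"].unique()
train_ids, test_ids = train_test_split(all_record_ids, test_size=0.2, random_state=0)

# Select the rows that belong to each group of records
train_X = X[X["RecordID"].isin(train_ids)]
test_X = X[X["RecordID"].isin(test_ids)]
```

With 10 records and `test_size=0.2`, 8 records (24 rows) land in the training subset and 2 records (6 rows) in the test subset.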