Develop #17

Open
wants to merge 117 commits into
base: master
117 commits
e871a26
add ablation
juexinwang Nov 13, 2020
0ce6b2e
add ablation tests on imputation
juexinwang Nov 13, 2020
157a773
add plot
juexinwang Nov 15, 2020
751360a
update dist
juexinwang Nov 15, 2020
0534832
reconstruct
juexinwang Nov 16, 2020
37c1a8b
reconstruct
juexinwang Nov 16, 2020
5f1dfc9
change numpy hist
juexinwang Nov 16, 2020
bc3fc5a
change numpy hist
juexinwang Nov 16, 2020
6b0c313
change numpy hist
juexinwang Nov 16, 2020
fd5c4ae
change numpy hist
juexinwang Nov 16, 2020
4398da5
change fig
juexinwang Nov 16, 2020
8677a70
add r support
juexinwang Nov 17, 2020
a2b27d1
add generating distribution
juexinwang Nov 17, 2020
be58381
add distribution sbatch file
juexinwang Nov 17, 2020
28451f5
update fig
juexinwang Nov 17, 2020
d805f4b
debug
juexinwang Nov 17, 2020
e823a2a
update
juexinwang Nov 17, 2020
d2c8eb9
change orders
juexinwang Nov 17, 2020
c404291
change orders
juexinwang Nov 17, 2020
6cb5bd6
change orders
juexinwang Nov 17, 2020
4b20132
change orders
juexinwang Nov 18, 2020
b72a43b
Add RMSE
juexinwang Nov 18, 2020
5fedb49
add recheck
Nov 22, 2020
00129a4
fix a bug
Nov 22, 2020
f0e7bdf
change to new format
Nov 22, 2020
92912a1
fix a bug
Nov 22, 2020
59d9cc1
add 9 and 11 data
Nov 22, 2020
b04bbb4
add 9 and 11 for recheck
Nov 23, 2020
d0f43ed
fix a bug
Nov 23, 2020
bfdcd91
fix a typo
Nov 23, 2020
f505af3
update converge type
Nov 23, 2020
9f7ae1e
change ranking
Nov 23, 2020
7863f74
add cosine
Nov 24, 2020
e18d291
add magic
Nov 24, 2020
013240c
add bash
Nov 24, 2020
8d313d7
add bash
Nov 24, 2020
bef9ce4
add scvi
Nov 25, 2020
ec3ab73
add scvi
Nov 25, 2020
87b5205
add scvi
Nov 25, 2020
09a1ae3
add asucie
Nov 25, 2020
0ed5e10
tmp dca
Nov 25, 2020
42578da
add dca/deepimpute
Nov 25, 2020
e937902
add deep impute and asucie
Nov 25, 2020
66e2204
update deepimpute
Nov 25, 2020
65b5203
update dca
Nov 25, 2020
10d6387
update deepimpute to raw counts
Nov 25, 2020
d36c62a
update GPU settings
Nov 25, 2020
d8469d6
update saucie
Nov 25, 2020
992580e
update saucie directory
Nov 25, 2020
d9d4932
update saucie directory
Nov 25, 2020
786b79c
update saucie directory
Nov 25, 2020
a9c9683
update saucie directory
Nov 25, 2020
c932614
update saucie directory
Nov 25, 2020
c74ffbd
modify saucie
Nov 26, 2020
1fe435a
add dca update
Nov 26, 2020
3a12bb0
add dca
Nov 26, 2020
068559d
update dca
Nov 26, 2020
c3252d1
update dca
Nov 26, 2020
8760811
add saver in impute
Nov 26, 2020
64458d4
add scimpute in imputation
Nov 26, 2020
583ab13
add scimpute in imputation
Nov 26, 2020
3cf5d3c
add scimpute in imputation
Nov 26, 2020
7eb9a20
add scimpute in imputation
Nov 26, 2020
9f06426
add scimpute in imputation debug
Nov 26, 2020
df944a4
move not using scripts in results to new folder
Nov 27, 2020
4db2a4e
reorganize old codes
Nov 27, 2020
5e4cedf
debugscimpute
Nov 27, 2020
ab1d2c8
fix saver issue
Nov 27, 2020
dec349f
scimpute for all possible scenarios
Nov 27, 2020
9c78428
imputation on other results
Nov 27, 2020
1e30f6f
imputation on other results
Nov 27, 2020
9dcf706
imputation on other results
Nov 27, 2020
d170da2
fix a log error in imputation of scGNN, rerun
Nov 27, 2020
5fa9822
update sbatch infor
Nov 27, 2020
58808fe
Partly add scNMF and scGAN
Nov 28, 2020
49787b6
update a new version of main_benchmark with timer and mem infor
Nov 28, 2020
169864c
ratio 0.0
Nov 28, 2020
9a27b23
ratio 0.0
Nov 28, 2020
3dddc7f
for ratio 0.0
Nov 29, 2020
24c88e6
add figure 3 interactions
Nov 29, 2020
3992609
update print format
Nov 29, 2020
5130c64
update print format in interaction
Nov 29, 2020
85550ff
output results in interaction
Nov 29, 2020
1939008
output results in interaction
Nov 29, 2020
08ef27e
output results in interaction
Nov 29, 2020
afdd7fa
add final mem in scGNN.py
Nov 30, 2020
012a693
update package dependence
Dec 1, 2020
92f43c6
add scIGANs and netNMFsc imputation evaluation
Dec 1, 2020
86c9a72
add figure3, all methods
Dec 1, 2020
cd58ea6
add scIGANs and netNMFsc imputation evaluation
Dec 1, 2020
decbddd
add figure3, all methods
Dec 1, 2020
cb2cec1
add figure3, all methods
Dec 1, 2020
eaa31f2
add figure3, all methods
Dec 1, 2020
0c30fc7
add figure3, all methods
Dec 1, 2020
879d156
fix a typo
Dec 1, 2020
142e91a
fix a typo
Dec 1, 2020
cf553ba
only focus on distribution
Dec 4, 2020
6eb4472
add npy2csv
Dec 8, 2020
d2cdfce
Significant! Now provides GPU! One known bug: exclude r-ltmgscgnn
juexinwang Dec 8, 2020
36f4ccb
add time test for both cpu and gpu
juexinwang Dec 8, 2020
3b0b1e8
add louvain
Dec 9, 2020
d535095
fix a bug
Dec 9, 2020
07d9af1
add benchmark
Dec 9, 2020
4f050f3
add benchmark fw
Dec 10, 2020
35eadb9
‘update’
Dec 10, 2020
696fcce
add all methods
Dec 10, 2020
64e9b37
add all methods
Dec 10, 2020
534ed1b
fix a bug in dca
Dec 10, 2020
68f2e6d
update name
Dec 10, 2020
137b5c6
recheck dca
Dec 10, 2020
98649a8
only use 12/13 for dca
Dec 10, 2020
f3fbc5f
back to full methods
Dec 10, 2020
5e5853a
add zero percentage calculation
Dec 10, 2020
4cf1e7d
add tmp results of celltype
Dec 12, 2020
b1317c5
add saucie plot
Dec 12, 2020
7d2c74a
add netNMF and scIGAN
Dec 27, 2020
89af13c
Create choose_louvain.py
juexinwang Feb 19, 2021
50 changes: 50 additions & 0 deletions bak/npy2csv_script.py
@@ -0,0 +1,50 @@
import numpy as np
import pandas as pd

def convert(method='dca'):
    t=np.load(method+'\\9.Chung_0.0_1_recon.npy')
    df = pd.DataFrame(t)
    df.to_csv(method+'_9.csv',header=None,index=False)

    t=np.load(method+'\\11.Kolodziejczyk_0.0_1_recon.npy')
    df = pd.DataFrame(t)
    df.to_csv(method+'_11.csv',header=None,index=False)

    t=np.load(method+'\\12.Klein_0.0_1_recon.npy')
    df = pd.DataFrame(t)
    df.to_csv(method+'_12.csv',header=None,index=False)

    t=np.load(method+'\\13.Zeisel_0.0_1_recon.npy')
    df = pd.DataFrame(t)
    df.to_csv(method+'_13.csv',header=None,index=False)

convert('dca')
convert('deepimpute')
convert('magic')
convert('netNMFsc')
convert('saucie')
convert('saver')
convert('scimpute')
convert('scvi')


def convertCSV(method='scIGANs'):
    df = pd.read_csv(method+'\\9.Chung_0.0_1_recon.csv.txt',sep=r'\s+',index_col=0)
    df = df.T
    df.to_csv(method+'_9.csv',header=None,index=False)

    df = pd.read_csv(method+'\\11.Kolodziejczyk_0.0_1_recon.csv.txt',sep=r'\s+',index_col=0)
    df = df.T
    df.to_csv(method+'_11.csv',header=None,index=False)

    df = pd.read_csv(method+'\\12.Klein_0.0_1_recon.csv.txt',sep=r'\s+',index_col=0)
    df = df.T
    df.to_csv(method+'_12.csv',header=None,index=False)

    df = pd.read_csv(method+'\\13.Zeisel_0.0_1_recon.csv.txt',sep=r'\s+',index_col=0)
    df = df.T
    df.to_csv(method+'_13.csv',header=None,index=False)

convertCSV('scIGANs')
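
The four near-identical blocks in `convert` could be collapsed into a loop. The sketch below is not the code in this PR: the helper name `convert_all`, the `DATASETS` mapping, and the `src_root`/`out_dir` parameters are hypothetical, and it assumes the same `<method>/<dataset>_0.0_1_recon.npy` layout as the script above. It uses `os.path.join` instead of a hard-coded `\\` so the paths would also resolve outside Windows.

```python
import os

import numpy as np
import pandas as pd

# Dataset id -> file prefix, taken from the file names used in npy2csv_script.py.
DATASETS = {
    "9": "9.Chung",
    "11": "11.Kolodziejczyk",
    "12": "12.Klein",
    "13": "13.Zeisel",
}


def convert_all(method, src_root=".", out_dir="."):
    """Hypothetical helper: write <method>_<id>.csv for every dataset in DATASETS."""
    written = []
    for ds_id, prefix in DATASETS.items():
        src = os.path.join(src_root, method, prefix + "_0.0_1_recon.npy")
        dst = os.path.join(out_dir, "{}_{}.csv".format(method, ds_id))
        # Same output format as the script: no header row, no index column.
        pd.DataFrame(np.load(src)).to_csv(dst, header=False, index=False)
        written.append(dst)
    return written
```

Under these assumptions, the eight `convert(...)` calls would reduce to a loop over the method names that calls `convert_all(method)`.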


15 files renamed without changes.
@@ -13,6 +13,7 @@
args = parser.parse_args()

# Note:
# Main Check results
# Generate results in Python rather than in shell for better organization
# We do not use runpy.run_path('main_result.py') because it is hard to pass arguments
# We do not use subprocess.call("python main_result.py", shell=True) because it runs scripts in parallel
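
For reference, one way to launch a script with arguments and wait for it before starting the next one is `subprocess.run` with an argument list. This is a minimal sketch, not what this file does: the helper name `run_step` is hypothetical, and any script name or flag passed to it is the caller's assumption. `subprocess.run` blocks until the child process exits, so consecutive calls execute one after another.

```python
import subprocess
import sys


def run_step(script, *script_args):
    """Hypothetical helper: run one script with arguments and wait for it to finish."""
    cmd = [sys.executable, script] + [str(a) for a in script_args]
    # check=True raises CalledProcessError if the script exits with a non-zero code.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Passing the command as a list avoids `shell=True` and the quoting issues that come with building a shell string.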
3 files renamed without changes.
4 changes: 2 additions & 2 deletions results/results_impute.py → bak/results/results_impute.py
@@ -56,8 +56,8 @@
 dropix = np.load(args.npyDir+args.datasetName+'_'+args.regulized_type+discreteStr+'_'+args.ratio+'_dropix.npy')

 featuresImpute = np.load(args.npyDir+args.datasetName+'_'+args.regulized_type+discreteStr+'_'+args.ratio+'_recon'+args.reconstr+'.npy')
-l1ErrorMean, l1ErrorMedian, l1ErrorMin, l1ErrorMax = imputation_error_log(featuresImpute, featuresOriginal, features, dropi, dropj, dropix)
-print('{:.4f} {:.4f} {:.4f} {:.4f} '.format(l1ErrorMean, l1ErrorMedian, l1ErrorMin, l1ErrorMax), end='')
+l1ErrorMean, l1ErrorMedian, l1ErrorMin, l1ErrorMax, rmse = imputation_error_log(featuresImpute, featuresOriginal, features, dropi, dropj, dropix)
+print('{:.4f} {:.4f} {:.4f} {:.4f} {:.4f} '.format(l1ErrorMean, l1ErrorMedian, l1ErrorMin, l1ErrorMax, rmse), end='')

 def imputeResult(inputData):
     '''
2 files renamed without changes.
10 changes: 7 additions & 3 deletions benchmark_util.py
@@ -530,6 +530,7 @@ def imputation_error(X_mean, X, X_zero, i, j, ix):
         all_index = i[ix], j[ix]
         x, y = X_mean[all_index], X[all_index]
         result = np.abs(x - y)
+        rmse = ((x - y)**2/len(result))**0.5
     # If the input is a sparse matrix
     else:
         all_index = i[ix], j[ix]
@@ -538,8 +539,9 @@ def imputation_error(X_mean, X, X_zero, i, j, ix):
         yuse = scipy.sparse.lil_matrix.todense(y)
         yuse = np.asarray(yuse).reshape(-1)
         result = np.abs(x - yuse)
+        rmse = ((x - yuse)**2/len(result))**0.5
     # return np.median(np.abs(x - yuse))
-    return np.mean(result), np.median(result), np.min(result), np.max(result)
+    return np.mean(result), np.median(result), np.min(result), np.max(result), np.mean(rmse)


 # IMPUTATION METRICS
@@ -562,6 +564,7 @@ def imputation_error_log(X_mean, X, X_zero, i, j, ix):
         all_index = i[ix], j[ix]
         x, y = X_mean[all_index], X[all_index]
         result = np.abs(x - np.log(y+1))
+        rmse = ((x - np.log(y+1))**2/len(result))**0.5
     # If the input is a sparse matrix
     else:
         all_index = i[ix], j[ix]
@@ -570,10 +573,11 @@ def imputation_error_log(X_mean, X, X_zero, i, j, ix):
         yuse = scipy.sparse.lil_matrix.todense(y)
         yuse = np.asarray(yuse).reshape(-1)
         result = np.abs(x - np.log(yuse+1))
+        rmse = ((x - np.log(yuse+1))**2/len(result))**0.5
     # return np.median(np.abs(x - yuse))
-    return np.mean(result), np.median(result), np.min(result), np.max(result)
+    return np.mean(result), np.median(result), np.min(result), np.max(result), np.mean(rmse)

-# cosine similarity
+# cosine similarity with log
 def imputation_cosine_log(X_mean, X, X_zero, i, j, ix):
     """
     X_mean: imputed dataset
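
A note on the added `rmse` lines: the expression `((x - y)**2/len(result))**0.5` is evaluated element-wise, so it equals `|x - y| / sqrt(n)`, and the returned `np.mean(rmse)` is the mean absolute error divided by `sqrt(n)` rather than a root-mean-square error. A minimal sketch of the conventional definition, `sqrt(mean((x - y)**2))`, is shown below. The function name `rmse_dense` is hypothetical and the sketch assumes dense 1-D NumPy arrays such as the `x` and `y` selected by `all_index`.

```python
import numpy as np


def rmse_dense(x, y):
    """Hypothetical helper: root-mean-square error between two equally sized arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Square the errors, average them, then take the square root once.
    return float(np.sqrt(np.mean((x - y) ** 2)))
```

For the log variant, the same helper could be called with `np.log(y + 1)` as the second argument, matching the transform already used for `result`.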
71 changes: 71 additions & 0 deletions codesfromJGandYJ/codeForCellcluster/Run_netNMF_celltype.py
@@ -0,0 +1,71 @@
# This code has not been cleaned up yet
# run netNMF-sc from command line and save outputs to specified directory
from __future__ import print_function
import numpy as np
from warnings import warn
from joblib import Parallel, delayed
import copy,argparse,os,math,random,time
from scipy import sparse, io,linalg
from scipy.sparse import csr_matrix
import warnings
from netNMFsc import plot
warnings.simplefilter(action='ignore', category=FutureWarning)
import pandas as pd

def main(args):
    if args.method == 'GD':
        from netNMFsc import netNMFGD
        operator = netNMFGD(d=args.dimensions, alpha=args.alpha, n_inits=1, tol=args.tol, max_iter=args.max_iters, n_jobs=1)
    elif args.method == 'MU':
        from netNMFsc import netNMFMU
        operator = netNMFMU(d=args.dimensions, alpha=args.alpha, n_inits=1, tol=args.tol, max_iter=args.max_iters, n_jobs=1)


    chung = pd.read_csv(args.filename, header=0,
                        index_col=0, sep=',')
    X = chung.values
    genes = []
    for gen in chung.index.values:
        if '.' in gen:
            genes.append(gen.upper().split('.')[0])
        else:
            genes.append(gen.upper())
    #print(genes)
    operator.X = X
    operator.genes = np.asarray(genes)
    #operator.load_10X(direc=args.tenXdir,genome='mm10')
    operator.load_network(net=args.network,genenames=args.netgenes,sparsity=args.sparsity)
    dictW = operator.fit_transform()
    W, H = dictW['W'], dictW['H']
    k,clusters = plot.select_clusters(H,max_clusters=20)
    os.system('mkdir -p %s'%(args.direc))
    plot.tSNE(H,clusters,fname=args.direc + '/netNMFsc_tsne')
    np.save(os.path.join(args.direc,'W.npy'),W)
    np.save(os.path.join(args.direc,'H.npy'),H)
    np.save(os.path.join(args.direc, 'cluster.npy'), clusters)
    return
#/storage/htc/joshilab/jghhd/singlecellTest/netNMFsc/netNMF-sc/netNMFsc/refdata/

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-m","--method",help="either 'GD' for gradient descent or 'MU' for multiplicative update",type=str,default='GD')
    parser.add_argument("-f","--filename", help="path to data file (.npy or .mtx)",type=str,default='matrix.mtx')
    parser.add_argument("-g","--gene_names", help="path to file containing gene names (.npy or .tsv)",type=str,default='gene_names.tsv')
    parser.add_argument("-net","--network", help="path to network file (.npy or .mtx)",type=str,default='')
    parser.add_argument("-netgenes","--netgenes", help="path to file containing gene names for network (.npy or .tsv)",type=str,default='')
    parser.add_argument("-org","--organism", help="mouse or human",type=str,default='human')
    parser.add_argument("-id","--idtype", help="ensemble, symbol, or entrez",type=str,default='ensemble')
    parser.add_argument("-netid","--netidtype", help="ensemble, symbol, or entrez",type=str,default='entrez')
    parser.add_argument("-n","--normalize", help="normalize data? 1 = yes, 0 = no",type=int,default=0)
    parser.add_argument("-sparse","--sparsity", help="sparsity for network",type=float,default=0.99)
    parser.add_argument("-mi","--max_iters", help="max iters for netNMF-sc",type=int,default=1500)
    parser.add_argument("-t","--tol", help="tolerance for netNMF-sc",type=float,default=1e-2)
    parser.add_argument("-d","--direc", help="directory to save files",default='')
    parser.add_argument("-D","--dimensions", help="number of dimensions to apply shift",type=int,default = 10)
    parser.add_argument("-a","--alpha", help="lambda param for netNMF-sc",type=float,default = 1.0)
    parser.add_argument("-x","--tenXdir", help="data is from 10X. Only required to provide directory containing matrix.mtx, genes.tsv, barcodes.tsv files",type=str,default = '')
    args = parser.parse_args()
    main(args)


#'/storage/htc/joshilab/jghhd/singlecellTest/Data/11.Kolodziejczyk/Use_expression.csv'
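
The gene-name loop in `main` upper-cases each identifier and drops everything after the first `.` (for example a version suffix). Since `str.split('.')[0]` returns the whole string when no `.` is present, both branches of the `if` give the same result and the loop could be written as one expression. A small sketch, with the hypothetical helper name `normalize_gene_names`:

```python
def normalize_gene_names(names):
    """Hypothetical helper: upper-case identifiers and drop any suffix after the first '.'."""
    return [str(name).upper().split(".")[0] for name in names]
```

With this helper, the assignment would become `operator.genes = np.asarray(normalize_gene_names(chung.index.values))`, assuming the index holds string identifiers as in the loop above.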
82 changes: 0 additions & 82 deletions codesfromJGandYJ/impute code/MAGIC_impute.py

This file was deleted.

56 changes: 0 additions & 56 deletions codesfromJGandYJ/impute code/SAVER_impute.py

This file was deleted.

56 changes: 0 additions & 56 deletions codesfromJGandYJ/impute code/SCIMPUTE.py

This file was deleted.
