Develop (#64)
* add atmos ETL process
Added the ETL DAG.
Fixing a DB-engine-related error (in progress).

* Add new train process

A new training process that applies MLflow, Prefect, and Ray Tune.

* Add save best model logic

After an experiment finishes, the best model is found and its information is recorded in the db.

Predict then queries the db to find the best model and runs prediction with it.

* modify atmos ETL pipeline
1. If no data is returned after a data request, the flow now terminates.
2. If any data fails validation, the flow terminates without saving.
3. Added code that sets a cron schedule.

* Fix save model logic

When recording the best model in the db, the stored artifact_uri now includes the artifact_path.

* Add redis

Used Redis to improve inefficient reads.

* Add mnist training

Added the mnist classification training process.

* add training model process after data ETL
Once data collection finishes successfully, training runs on the newly collected data, and the model is replaced if performance improves.

* Fix save logic

Changed the save step to store the run_id.

* Fix save logic

Reworking the logic so it returns true/false (in progress).

* Add knn model train & save

Trains and saves the knn model.
train_df is fixed.
The knn model is trained and saved only when the cnn model has been updated.
The most recently logged knn model is the one recorded in the db.

* add redis caching & modify load model process
1. Added Redis-based model caching to the predict API. If a model is not in Redis, it is fetched from the database, stored in Redis, and deleted again if no prediction requests for it arrive within a set period.
2. Changed model loading to use the run id instead of the artifact path.

* Modify predict redis
Fixed a malfunctioning part.

* Add redis update time logic
1. Improved the algorithm that caches models in Redis.
Previous approach: a cached model was deleted after a fixed period.
Improved approach: whenever a cached model receives a prediction request, its expiry timer is reset.

* Add mnist prediction route

Added a route that loads the mnist model and runs prediction.
For now it predicts from a temporary local file; it will be updated to accept an input value.

++ Used JIT script in the model training process so the model loads and runs without issues.
++ Moved the model class out of the task file in the training process; keeping it there caused errors when using the model.

* Add redis at mnist prediction

The model could not be pushed into Redis with pickle, so it is written with save_to_buffer and read back as bytes.
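A sketch of that buffer round-trip (the real code serializes a torch model; here stdlib `pickle` and a plain dict stand in, and `save_to_buffer`/`load_from_buffer` only loosely mirror the commit's code):

```python
import io
import pickle


def save_to_buffer(obj) -> bytes:
    # Serialize into an in-memory buffer, then hand the raw bytes to Redis (SET key value).
    buf = io.BytesIO()
    pickle.dump(obj, buf)  # the real code serializes a torch model here instead
    return buf.getvalue()


def load_from_buffer(raw: bytes):
    # Redis returns bytes; wrap them in a buffer and deserialize.
    return pickle.load(io.BytesIO(raw))


weights = {"fc1": [0.1, 0.2], "fc2": [0.3]}  # stand-in for the model
restored = load_from_buffer(save_to_buffer(weights))
```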

* Add redis connection pool
Introduced a Redis connection pool.

* Fix data load & cleanup

Removed some hard-coded values, and reworked the loading logic for better reusability.

* Fix data path

Changed data loading to read from storage.

* Modify atmos predict api
1. Removed a duplicated (nested) block of code that loads the ML model from Redis.

* Test performance according to method
1. Added timing code to measure how long each step takes: loading the model from the postgres DB, loading it from Redis, serialization, and deserialization.
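Such timing code can be as simple as a `perf_counter` context manager (a hedged sketch; the label and the measured step are placeholders for the DB-load, Redis-load, serialize, and deserialize steps):

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    # Record the wall-clock duration of the enclosed block under `label`.
    start = time.perf_counter()
    yield
    results[label] = time.perf_counter() - start


timings = {}
with timed("deserialize", timings):
    sum(range(10_000))  # stand-in for the step being measured
```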

* Modify caching algorithm
1. Stopped caching models in Redis or any other DB. A model_timer class now holds the model as an instance variable and deletes it after a set period. Since the model no longer needs to be serialized, inference speed improved significantly.
2. The first load of an uncached model from mlflow is still slow.
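The model_timer idea can be sketched as a small in-process holder (illustrative only; the real class name, TTL, and stored object differ):

```python
import time


class ModelTimer:
    """Hold a model in memory and drop it after `ttl` seconds without access."""

    def __init__(self, ttl: float = 600.0):
        self._ttl = ttl
        self._model = None
        self._expires_at = 0.0

    def cache(self, model):
        self._model = model
        self.reset_timer()

    def reset_timer(self):
        # Each prediction request pushes the expiry back (the later improvement).
        self._expires_at = time.monotonic() + self._ttl

    def get(self):
        if self._model is not None and time.monotonic() >= self._expires_at:
            self._model = None  # expired: force a reload from mlflow next time
        return self._model


cache = ModelTimer(ttl=0.05)
cache.cache("fake-model")  # stand-in for the loaded model object
```

No serialization happens anywhere, which is exactly why this beats the Redis approach for in-process inference.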

* Modify cache class
1. Renamed the cache class to something more generic.
2. No functional changes.
3. Removed the predict method.

* Fix mnist model caching

Improved the code that used to read models through Redis.
Caching a model in Redis requires serializing and deserializing it,
and that took too long, so the cache is now managed directly in code.
The model loading and caching logic is grouped into a class.

* Fix model load logic

Model loading and caching now runs under a lock.
Data is also cached temporarily for reuse.
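The lock-guarded load described above is the classic double-checked pattern with `asyncio.Lock`; a minimal sketch (the loader function is a stand-in for the real mlflow call):

```python
import asyncio

_model = None
_lock = asyncio.Lock()


async def load_model_once(loader):
    """Load the model exactly once even under concurrent requests."""
    global _model
    if _model is None:                 # fast path: no lock once cached
        async with _lock:
            if _model is None:         # re-check: another coroutine may have loaded it
                _model = await asyncio.to_thread(loader)
    return _model


def slow_loader():
    return "model-v1"  # stand-in for loading the model from mlflow


model = asyncio.run(load_model_once(slow_loader))
```

The second `if _model is None` inside the lock is what prevents two coroutines from both triggering the expensive load.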

* Remove unnecessary code

Removed code that was only used for testing.

* Delete redis
Removed Redis.
Fixed an incorrectly written async function.

* Update README

* Add prefect working directory
1. Fixed a problem where paths could not be found depending on where the prefect agent runs.
2. Specifying a working directory on the flow solves it.

* Add more metrics to Mnist

Added per-class accuracy measurement to the Mnist training process.

* Add pipeline option

Updated the insurance pipeline as well, so multiple pipelines can run at the same time.

* Update insurance save logic

Updated the insurance part to store the run_id, the same as the others.

* Add insurance predict

Insurance prediction now follows the same method as the others.

* Fix task decorator

Uncommented a @task decorator that had been commented out during testing.

* Remove resource_per_trial

Removed resource_per_trial, since specifying it left trials stuck in the pending state.

* Fix mlflow-url

Fixed the mlflow default url.

* Add git action to build docker containers
1. Added a GitHub Action that builds the Dockerfiles.
2. Fixed some misconfigured host names.
3. Updated requirements.txt.

* Remove PR condition

Removed the pull-request trigger from the Build API server container workflow.

* Modify mnist prediction

Changed Mnist prediction to take an input and predict on it.
Wrapped the predict step in run_in_threadpool.
Made the return value more granular.
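`run_in_threadpool` keeps the blocking predict call off the event loop; a sketch using the stdlib equivalent `asyncio.to_thread` (starlette's helper wraps the same run-on-a-worker-thread mechanism, and the model call here is a stand-in):

```python
import asyncio


def sync_predict(pixels):
    # Blocking, CPU-bound prediction (stand-in for the real model call).
    return {"answer": max(range(len(pixels)), key=pixels.__getitem__)}


async def predict_route(pixels):
    # starlette's run_in_threadpool does essentially this: run the blocking
    # call on a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(sync_predict, pixels)


result = asyncio.run(predict_route([0.1, 0.7, 0.2]))
```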

* Add kubernetes yaml files
1. Added the Kubernetes configuration files.
2. Modified prefect_Dockerfile.

* Remove testing branch

Removed feature/kubernetes from the on-push trigger branches; it had been added for testing.

* Add deprecated folder

Deprecated items are collected into a folder for the record.

* Delete experiments and import train code
1. Removed the unused experiments folder.
2. Fixed an error caused by importing the train api in main.py.

* Modify Dockerfile name
1. Renamed files from the xxx_Dockerfile pattern to Dockerfile.xxx. With this naming, Dockerfiles group together in directory listings, which should improve readability.

* Add load type

Besides fetching the best-performing model, model loading can now also
fetch the model registered as production.
This method applies when mlflow manages the model's production and staging stages.

* Fix data load logic

Data loading no longer reads a path from an environment variable;
it can now read from the db by version and by experiment.

* Modify Mnist Train

The layer before the output layer extracts 64 features.
When building the model's xai features, the prediction of the model with the output layer kept is also used as features, so the knn trains on 74 features.
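The feature count works out as 64 penultimate-layer features plus the 10-way prediction, 74 per sample. A sketch of the concatenation, with random arrays standing in for the real activations:

```python
import numpy as np

rng = np.random.default_rng(0)
penultimate = rng.normal(size=(1, 64))   # features from the layer before the output
class_scores = rng.normal(size=(1, 10))  # prediction of the model with the output layer kept

# Concatenate along the feature axis: 64 + 10 = 74 features per sample for the knn.
knn_features = np.concatenate((penultimate, class_scores), axis=1)
```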

* Modify Mnist Predict

Updated to match the changes in the train part.

* Add mnist avg metadata

Added information about the pixel mean.

At prediction time, the pixel mean of the input data is also logged.

* Add is_cloud parameter

Added the missing is_cloud parameter.

* Add Continuous deploy process
- Added a rollout step, enabling zero-downtime deployment.

* Modify Git Action workflows
- Split the workflow so that only the deployments needed for continuous deployment are restarted.

* Modify Git Action workflows
- Split the workflow so that only the Kubernetes deployments needed for continuous deployment are restarted.

* Feature/readme (#57)

Update README

* Feature/data load (#58)

update data load

* Update phase2.md

* Feature/readme (#59)

Update README

* Feature/readme (#61)

* Update README

Added content to the README.

* Add phase1 info

Added some information about Phase1.

* Add phase1 info

* Modify phase1 info

* Add phase2-local

* Add logos
Added logos.

* Resize images

* Add requirements.sh

* Modify main readme page
Added figures and tweaked the structure a bit.

* Delete phase2.PNG

* Add figure
Added the phase2 figure.

* Modify readme file
λ‚΄μš©μ˜ μˆœμ„œλ₯Ό λ³€κ²½ν•˜μ˜€μŠ΅λ‹ˆλ‹€.

* Remove local.md

* Add info phase2

* Add frontend link

* Modify readme
ν”„λ‘œμ νŠΈ μ†Œκ°œλΆ€λΆ„ μˆ˜μ •ν•˜μ˜€μŠ΅λ‹ˆλ‹€.

* Modify readme
Added an explanation of how to run the docker containers.

* Add readme

* Update phase2.md

* Update README.md

* Delete kubernetes nodeselector
1. Removed the nodeSelector settings from the deployments.

* Modify readme.md
1. Restructured the readme file.
2. Rewrote the Phase2 details to match the new structure.
3. Still a work in progress.

* Modify README.md
1. Revised the part of the readme that describes the phase2 project.

Co-authored-by: ehddnr301 <[email protected]>
chl8469 and ehddnr301 authored Dec 9, 2021
1 parent 22d57cb commit 80866f5
Showing 65 changed files with 2,963 additions and 153 deletions.
52 changes: 52 additions & 0 deletions .github/workflows/build_apiserver.yaml
@@ -0,0 +1,52 @@
name: Build API server container
on:
push:
branches: [ main ]

jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Check Out Repo
uses: actions/checkout@v2

- name: Login to Docker Hub
uses: docker/login-action@v1
with:
username: ${{ secrets.DOCKER_HUB_USERNAME }}
password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}

- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v1

- name: Build and push api-server
id: api-server
uses: docker/build-push-action@v2
with:
context: ./
file: ./Dockerfile.fastapi
push: true
tags: ${{ secrets.DOCKER_HUB_USERNAME }}/mlops-project:api-server-1.0

- name: Build and push prefect-worker
id: prefect-worker
uses: docker/build-push-action@v2
with:
context: ./
file: ./Dockerfile.prefect
push: true
tags: ${{ secrets.DOCKER_HUB_USERNAME }}/mlops-project:prefect-worker-1.0

- name: Image digest
run: echo ${{ steps.docker_build.outputs.digest }}

- name: Deploy
uses: appleboy/ssh-action@master
with:
host: ${{ secrets.REMOTE_IP }}
username: ${{ secrets.REMOTE_SSH_ID }}
port: ${{ secrets.REMOTE_SSH_PORT }}
key: ${{ secrets.REMOTE_SSH_KEY }}
script: |
kubectl rollout restart -f ./MLOps/k8s/prepi_deployments.yaml
5 changes: 5 additions & 0 deletions .gitignore
@@ -4,3 +4,8 @@ __pycache__
tf_model/**/*
log.txt
experiments/**/temp/
.ssl/
prefect/atmos_tmp_pipeline/ray_mlflow
prefect/atmos_tmp_pipeline/*.sh
mlruns
exp_models
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -8,4 +8,5 @@ repos:
rev: 5.6.4
hooks:
- id: isort
language_version: python3
language_version: python3
args: ["--profile", "black"]
8 changes: 8 additions & 0 deletions Dockerfile.baseimage
@@ -0,0 +1,8 @@
FROM python:3.8

COPY requirements.txt /requirements.txt

RUN pip install --upgrade pip &&\
pip install --no-cache-dir -r requirements.txt &&\
pip uninstall -y tensorflow==2.6 &&\
pip install --no-cache-dir tensorflow-cpu==2.4
7 changes: 7 additions & 0 deletions Dockerfile.fastapi
@@ -0,0 +1,7 @@
FROM hl8469/mlops-project:base-image-1.0

COPY . /

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "3"]
8 changes: 8 additions & 0 deletions Dockerfile.prefect
@@ -0,0 +1,8 @@
FROM hl8469/mlops-project:base-image-1.0

COPY ./prefect /prefect
COPY ./set_prefect.sh /

RUN prefect backend cloud

CMD /set_prefect.sh
142 changes: 138 additions & 4 deletions README.md

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions app/api/schemas.py β†’ app/api/data_class.py
@@ -29,3 +29,7 @@ class ModelCorePrediction(BaseModel):
class ModelCore(ModelCoreBase):
class Config:
orm_mode = True


class MnistData(BaseModel):
mnist_num: str
162 changes: 124 additions & 38 deletions app/api/router/predict.py
@@ -1,18 +1,34 @@
# -*- coding: utf-8 -*-
import ast
import asyncio
import os
from typing import List

import mlflow
import numpy as np
import pandas as pd
import torchvision.transforms as transforms
import xgboost as xgb
from dotenv import load_dotenv
from fastapi import APIRouter
from starlette.concurrency import run_in_threadpool

from app import models
from app.api.schemas import ModelCorePrediction
from app import schema
from app.api.data_class import MnistData, ModelCorePrediction
from app.database import engine
from app.utils import ScikitLearnModel, my_model
from app.query import SELECT_BEST_MODEL
from app.utils import CachingModel, VarTimer, load_data, softmax
from logger import L

models.Base.metadata.create_all(bind=engine)
load_dotenv()

schema.Base.metadata.create_all(bind=engine)

host_url = os.getenv("MLFLOW_HOST")
mlflow.set_tracking_uri(host_url)
reset_sec = 5
CLOUD_STORAGE_NAME = os.getenv("CLOUD_STORAGE_NAME")
CLOUD_VALID_MNIST = os.getenv("CLOUD_VALID_MNIST")

router = APIRouter(
prefix="/predict",
@@ -21,66 +37,136 @@
)


@router.put("/insurance")
async def predict_insurance(info: ModelCorePrediction, model_name: str):
"""
Takes the input info and returns the predicted insurance fee.
Args:
info(dict): expects the following values: age(int), sex(int), bmi(float), children(int), smoker(int), region(int)
Returns:
insurance_fee(float): the predicted insurance fee.
"""

def sync_call(info, model_name):
"""
Helper that wraps the work as a sync function; its inputs and outputs match the parent function.
"""
model = ScikitLearnModel(model_name)
model.load_model()

info = info.dict()
test_set = np.array([*info.values()]).reshape(1, -1)

pred = model.predict_target(test_set)
return {"result": pred.tolist()[0]}
mnist_model = CachingModel("pytorch", 600)
knn_model = CachingModel("sklearn", 600)
data_lock = asyncio.Lock()
train_df = VarTimer(600)


@router.put("/mnist")
async def predict_mnist(item: MnistData):
global train_df
global mnist_model, knn_model

item2 = np.array(ast.literal_eval(item.mnist_num)).astype(np.uint8)
model_name = "mnist"
model_name2 = "mnist_knn"
is_cloud = False
data_version = 1
exp_name = 'mnist'

if not isinstance(train_df._var, pd.DataFrame):
async with data_lock:
if not isinstance(train_df._var, pd.DataFrame):
df, _ = load_data(is_cloud, data_version, exp_name)
train_df.cache_var(df)

transform = transforms.Compose(
[transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
)
reshaped_input = item2.reshape(28, 28)
transformed_input = transform(reshaped_input)
transformed_input = transformed_input.view(1, 1, 28, 28)

await mnist_model.get_model(model_name, load_type="production")
await knn_model.get_model(model_name2, load_type="production")

def sync_call(mnist_model, knn_model, train_df):
# Net1
result = mnist_model.predict(transformed_input)
p_res = softmax(result.detach().numpy()) * 100
percentage = np.around(p_res[0], 2).tolist()
# Net2
result = mnist_model.predict(transformed_input, True)
result = np.concatenate((result.detach().numpy(), np.array(percentage).reshape(1,-1) / 10), axis=1)
# KNN
knn_result = knn_model.predict(result)
xai_result = train_df.get_var().iloc[knn_result, 1:].values[0].tolist()
return {
"result": {
"percentage": percentage,
"answer": percentage.index(max(percentage)),
"xai_result": xai_result,
},
"error": None,
}

try:
result = await run_in_threadpool(sync_call, info, model_name)
result = await run_in_threadpool(
sync_call, mnist_model, knn_model, train_df
)
L.info(
f"Predict Args info: {info}\n\tmodel_name: {model_name}\n\tPrediction Result: {result}"
f"Predict Args info: {item.mnist_num}\n\tmodel_name: {model_name}\n\tPrediction Result: {result}\n\tcolor_avg_{result['result']['answer']}: {np.round(np.mean(item2), 2)}"
)
return result

except Exception as e:
L.error(e)
return {"result": "Can't predict", "error": str(e)}


@router.put("/atmos")
async def predict_temperature(time_series: List[float]):
insurance_model = CachingModel("xgboost", 30)


@router.put("/insurance")
async def predict_insurance(info: ModelCorePrediction):
info = info.dict()
test_set = xgb.DMatrix(np.array([*info.values()]).reshape(1, -1))

model_name = "insurance"
await insurance_model.get_model(model_name, load_type="production")
result = insurance_model.predict(test_set)

result = float(result[0])
return {
"result": result,
"error": None,
}


lock = asyncio.Lock()
atmos_model_cache = VarTimer()


@router.put("/atmos_temperature")
async def predict_temperature_(time_series: List[float]):
"""
μ˜¨λ„ 1μ‹œκ°„ 간격 μ‹œκ³„μ—΄μ„ μž…λ ₯λ°›μ•„ 이후 24μ‹œκ°„ λ™μ•ˆμ˜ μ˜¨λ„λ₯Ό 1μ‹œκ°„ κ°„κ²©μ˜ μ‹œκ³„μ—΄λ‘œ μ˜ˆμΈ‘ν•©λ‹ˆλ‹€.
Args:
time_series(List): 72μ‹œκ°„ λ™μ•ˆμ˜ 1μ‹œκ°„ 간격 μ˜¨λ„ μ‹œκ³„μ—΄ μž…λ‹ˆλ‹€. 72개의 μ›μ†Œλ₯Ό κ°€μ Έμ•Ό ν•©λ‹ˆλ‹€.
Returns:
List[float]: μž…λ ₯받은 μ‹œκ°„ 이후 24μ‹œκ°„ λ™μ•ˆμ˜ 1μ‹œκ°„ 간격 μ˜¨λ„ 예츑 μ‹œκ³„μ—΄ μž…λ‹ˆλ‹€.
"""

global lock

if len(time_series) != 72:
L.error(f"input time_series: {time_series} is not valid")
return {"result": "time series must have 72 values", "error": None}

model_name = "atmos_tmp"

if not atmos_model_cache.is_var:
async with lock:
if not atmos_model_cache.is_var:
run_id = engine.execute(
SELECT_BEST_MODEL.format(model_name)
).fetchone()[0]
print("start load model from mlflow")
atmos_model_cache.cache_var(
mlflow.keras.load_model(f"runs:/{run_id}/model")
)
print("end load model from mlflow")

def sync_pred_ts(time_series):
"""
Helper that wraps the work as a sync function; its inputs and outputs match the parent function.
"""
time_series = np.array(time_series).reshape(1, -1, 1)
result = my_model.predict_target(time_series)

time_series = np.array(time_series).reshape(1, 72, 1)
result = atmos_model_cache.get_var().predict(time_series)
atmos_model_cache.reset_timer()
L.info(
f"Predict Args info: {time_series.flatten().tolist()}\n\tmodel_name: {my_model.model_name}\n\tPrediction Result: {result.tolist()[0]}"
f"Predict Args info: {time_series.flatten().tolist()}\n\tmodel_name: {model_name}\n\tPrediction Result: {result.tolist()[0]}"
)

return {"result": result.tolist(), "error": None}
