feat: support turbo engine remaining useful life prediction #8

Open: 1 commit to merge into base: main
8 changes: 4 additions & 4 deletions README.md
@@ -1,8 +1,8 @@
# RealTime Prediction Application Demos with FEDB and SparkFE
# RealTime Prediction Application Demos with OpenMLDB

* [**predict-taxi-trip-duration**](https://github.com/4paradigm/DemoApps/tree/master/predict-taxi-trip-duration)
* [**predict-taxi-trip-duration-notebook**](https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/develop_ml_application_tour.ipynb)
* **detect-online-transaction-exceptions** (oncoming)
* [**predict-taxi-trip-duration**](./predict-taxi-trip-duration)
* [**predict-taxi-trip-duration-notebook**](./demos_based_notebook/develop_ml_application_tour_taxi.ipynb)
* [**predict-turboengine-remaining-useful-life**](./demos_based_notebook/develop_ml_application_tour_rul.ipynb)

# Help

25 changes: 25 additions & 0 deletions demos_based_notebook/Dockerfile
@@ -0,0 +1,25 @@
FROM jupyter/all-spark-notebook:latest

RUN mkdir -p /home/jovyan/work/system/
COPY zookeeper-3.4.14.tar.gz /home/jovyan/work/system/
COPY fedb-2.2.0-linux.tar.gz /home/jovyan/work/system/
COPY spark-3.0.0-bin-sparkfe.tgz /home/jovyan/work/system/
COPY fedb-2.2.0-py3-none-any.whl /home/jovyan/work/system/
COPY jdk-8u141-linux-x64.tar.gz /home/jovyan/work/system/

RUN cd /home/jovyan/work/system/ && tar -zxvf jdk-8u141-linux-x64.tar.gz && rm jdk-8u141-linux-x64.tar.gz
ENV JAVA_HOME /home/jovyan/work/system/jdk1.8.0_141
ENV PATH $JAVA_HOME/bin:$PATH

RUN cd /home/jovyan/work/system && tar -zxvf zookeeper-3.4.14.tar.gz && rm zookeeper-3.4.14.tar.gz && cd zookeeper-3.4.14 && mv conf/zoo_sample.cfg conf/zoo.cfg
RUN cd /home/jovyan/work/system && tar -zxvf fedb-2.2.0-linux.tar.gz && rm fedb-2.2.0-linux.tar.gz
RUN cd /home/jovyan/work/system && tar -zxvf spark-3.0.0-bin-sparkfe.tgz && rm spark-3.0.0-bin-sparkfe.tgz && cd spark-3.0.0-bin-sparkfe/python && python3 setup.py install
ENV SPARK_HOME /home/jovyan/work/system/spark-3.0.0-bin-sparkfe/
RUN cd /home/jovyan/work/system && pip install fedb-2.2.0-py3-none-any.whl && rm fedb-2.2.0-py3-none-any.whl
RUN pip install lightgbm tornado requests
COPY rul /home/jovyan/work/rul
RUN pip install -r /home/jovyan/work/rul/requirements.txt
COPY taxi-trip /home/jovyan/work/taxi-trip
COPY develop_ml_application_tour_rul.ipynb /home/jovyan/work/
COPY develop_ml_application_tour_taxi.ipynb /home/jovyan/work/
WORKDIR /home/jovyan/work
File renamed without changes.
150 changes: 150 additions & 0 deletions demos_based_notebook/develop_ml_application_tour_rul.ipynb
@@ -0,0 +1,150 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "687334b0",
"metadata": {},
"source": [
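"# Rapidly Launching AI Applications on a Machine Learning Database: Building a Real-Time Turbofan Remaining-Useful-Life Prediction App\n",
"\n",
"Remaining useful life (RUL) is the length of time a system can keep operating normally after it has already been in service for some period. With an RUL estimate, engineers can schedule maintenance, optimize operating efficiency, and avoid unplanned downtime, so predicting RUL is the top priority in a predictive maintenance program. \n",
"The task here is to build a real-time application that predicts remaining useful life with a machine learning model. We use NASA's [Turbofan Engine Degradation Simulation Data Set](https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#turbofan) as the training and test sets.\n",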
"\n",
"![turbo](https://ftp.bmp.ovh/imgs/2021/06/ea39f4d2b326d619.jpeg)\n"
]
},
{
"cell_type": "markdown",
"id": "3e7559e1",
"metadata": {},
"source": [
"## Initialize the environment\n",
"Initialization installs [OpenMLDB](https://github.com/4paradigm/openmldb) and its runtime dependencies; see init.sh for the initialization script."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4be401ff",
"metadata": {},
"outputs": [],
"source": [
"!cd rul && sh init.sh"
]
},
{
"cell_type": "markdown",
"id": "9f31d07f",
"metadata": {},
"source": [
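"## Import historical data into [OpenMLDB](https://github.com/4paradigm/openmldb)\n",
"\n",
"Computing time-series features with [OpenMLDB](https://github.com/4paradigm/openmldb) requires historical data, so we import the historical data into fedb so that real-time inference can use it for feature computation. For the import code, see https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/demo/import.py .\n",
"\n",
"Here `test_FD004.txt` serves as the historical data; it contains sensor readings for multiple engines over multiple cycles."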
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a21f6c7",
"metadata": {},
"outputs": [],
"source": [
"!cd rul && python3 import.py"
]
},
{
"cell_type": "markdown",
"id": "f1217305",
"metadata": {},
"source": [
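"## Train the model\n",
"Model training requires training data; the following are used to generate it:\n",
"\n",
"* Script that generates the training feature matrix: train.py\n",
"* Training data: `train_FD004.txt`, sensor readings for multiple engines at different cycles\n",
"\n",
"The run ultimately produces the model."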
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f321fb05",
"metadata": {},
"outputs": [],
"source": [
"!cd rul && python3 train.py ./fe.sql /tmp/model.txt"
]
},
{
"cell_type": "markdown",
"id": "e843ece0",
"metadata": {},
"source": [
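"## Build a real-time inference HTTP service with the trained model\n",
"\n",
"The model is trained with `RandomForestRegressor` on the training feature matrix generated in the previous step.\n",
"The trained model is then combined with the historical data in [OpenMLDB](https://github.com/4paradigm/openmldb) to build a real-time inference service; see predict_server.py for the service code."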
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0cc7e074",
"metadata": {},
"outputs": [],
"source": [
"!cd rul && sh start_predict_server.sh ./fe.sql 9887 /tmp/model.txt"
]
},
{
"cell_type": "markdown",
"id": "4778edc1",
"metadata": {},
"source": [
"## Send an inference request over HTTP"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3aa25374",
"metadata": {},
"outputs": [],
"source": [
"!cd rul && python3 predict.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23cd8660",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
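Note on the training step in the notebook above: the diff does not show how train.py derives its regression target from `train_FD004.txt`. A common convention for run-to-failure training trajectories is to treat each engine's last recorded cycle as its failure point and label every row with the number of cycles left until then. The sketch below illustrates only that convention; the function name `add_rul_labels` and the `(engine_id, cycle)` row shape are assumptions made for illustration, not code from this PR.

```python
from collections import defaultdict


def add_rul_labels(rows):
    """Attach a remaining-useful-life label to each (engine_id, cycle) row.

    Assumes each engine's trajectory runs to failure, so the RUL of a row
    is the engine's last observed cycle minus the row's own cycle.
    """
    rows = list(rows)

    # First pass: find the last recorded cycle for every engine.
    last_cycle = defaultdict(int)
    for engine_id, cycle in rows:
        last_cycle[engine_id] = max(last_cycle[engine_id], cycle)

    # Second pass: label each row with the cycles remaining until that point.
    return [(e, c, last_cycle[e] - c) for e, c in rows]


# Hypothetical toy data: engine 1 runs for 3 cycles, engine 2 for 2 cycles.
sample = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
labeled = add_rul_labels(sample)
print(labeled)
```

In a real pipeline the same label would sit alongside the sensor columns, and the feature SQL in fe.sql would supply the windowed features used as model inputs.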
@@ -19,7 +19,7 @@
"metadata": {},
"source": [
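"## Initialize the environment\n",
"Initialization installs fedb and its runtime dependencies; see https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/demo/init.sh for the initialization script."
"Initialization installs [OpenMLDB](https://github.com/4paradigm/openmldb) and its runtime dependencies; see init.sh for the initialization script."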
]
},
{
@@ -29,17 +29,17 @@
"metadata": {},
"outputs": [],
"source": [
"!cd demo && sh init.sh"
"!cd taxi-trip && sh init.sh"
]
},
{
"cell_type": "markdown",
"id": "9f31d07f",
"metadata": {},
"source": [
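"## Import historical trip data into fedb\n",
"## Import historical trip data into [OpenMLDB](https://github.com/4paradigm/openmldb)\n",
"\n",
"Computing time-series features with fedb requires historical data, so we import the historical trip data into fedb so that real-time inference can use it for feature computation. For the import code, see https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/demo/import.py"
"Computing time-series features with [OpenMLDB](https://github.com/4paradigm/openmldb) requires historical data, so we import the historical trip data into [OpenMLDB](https://github.com/4paradigm/openmldb) so that real-time inference can use it for feature computation."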
]
},
{
@@ -49,7 +49,7 @@
"metadata": {},
"outputs": [],
"source": [
"!cd demo && python3 import.py"
"!cd taxi-trip && python3 import.py"
]
},
{
@@ -61,10 +61,10 @@
"\n",
"The model is trained on labeled data; the following are used for this task:\n",
"\n",
"* Training script: https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/demo/train_sql.py \n",
"* Training data: https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/demo/data/taxi_tour_table_train_simple.snappy.parquet\n",
"* Training script: train.py \n",
"* Training data: data/taxi_tour_table_train_simple.snappy.parquet\n",
"\n",
"The task ultimately produces a model.txt"
"The task ultimately produces /tmp/model.txt"
]
},
{
@@ -74,17 +74,17 @@
"metadata": {},
"outputs": [],
"source": [
"!cd demo && sh train.sh"
"!cd taxi-trip && python3 train.py ./fe.sql /tmp/model.txt"
]
},
{
"cell_type": "markdown",
"id": "e843ece0",
"metadata": {},
"source": [
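"## Build a real-time inference HTTP service connected to fedb using the trained model\n",
"## Build a real-time inference HTTP service connected to [OpenMLDB](https://github.com/4paradigm/openmldb) using the trained model\n",
"\n",
"Using the model generated in the previous step and the historical data in fedb, build a real-time inference service; see https://github.com/4paradigm/DemoApps/blob/main/predict-taxi-trip-duration-nb/demo/predict_server.py for the service code."
"Using the model generated in the previous step and the historical data in fedb, build a real-time inference service; see predict_server.py for the service code."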
]
},
{
@@ -94,7 +94,7 @@
"metadata": {},
"outputs": [],
"source": [
"!cd demo && sh start_predict_server.sh"
"!cd taxi-trip && sh start_predict_server.sh ./fe.sql 8887 /tmp/model.txt"
]
},
{
@@ -135,7 +135,7 @@
"metadata": {},
"outputs": [],
"source": [
"!cd demo && python3 predict.py"
"!cd taxi-trip && python3 predict.py"
]
}
],
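Note on the final notebook step: both notebooks start a prediction server on a fixed port (8887 for taxi-trip, 9887 for rul) and then run predict.py, whose contents are not part of this diff. As a rough illustration of what such a client might do, the sketch below builds a JSON POST request with the standard library. The `/predict` path, the helper name `build_predict_request`, and the payload fields are all hypothetical; the real route and request schema are defined by predict_server.py.

```python
import json
import urllib.request


def build_predict_request(features, host="127.0.0.1", port=8887, path="/predict"):
    """Build (but do not send) a JSON POST request for a prediction server.

    The path and payload layout are assumptions made for illustration only.
    """
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload; use port=9887 for the rul demo server.
req = build_predict_request({"engine_id": 1, "cycle": 42})
print(req.get_method(), req.full_url)

# Sending requires the server to be running, so it is left commented out:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.read().decode("utf-8"))
```

Separating request construction from sending keeps the example checkable without a live server; the commented lines show where the actual call would go.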