-
Notifications
You must be signed in to change notification settings - Fork 705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
predict sql submit predict job as maxcompute udf script #605
predict sql submit predict job as maxcompute udf script #605
Conversation
Execute the ODPS query failed:
This PR uses |
# 7. Load sqlflow Jupyter magic command automatically. c.f. https://stackoverflow.com/a/32683001. | ||
# 7. Install odpscmd for submitting alps predict job with odps udf script | ||
# TODO(Yancey1989): using gomaxcompute instead of the odpscmd command-line tool. | ||
wget -q http://docs-aliyun.cn-hangzhou.oss.aliyun-inc.com/assets/attach/119096/cn_zh/1557995455961/odpscmd_public.zip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure odpscmd
a public available program.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied the link from Aliyun, I think it's a public program.
sql/codegen_alps.go
Outdated
@@ -40,6 +43,7 @@ type alpsFiller struct { | |||
ModelDir string | |||
ScratchDir string | |||
PredictOutputTable string | |||
PredictInputModel string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can reuse ModelDir
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed it.
sql/sql.y
Outdated
$$.model = $4 | ||
} | ||
: PREDICT IDENT { $$.into = $2 } | ||
| PREDICT WITH attrs { $$.predAttrs = $3 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why not PREDICT IDENT WITH attrs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it.
sql/sql.y
Outdated
} | ||
: PREDICT IDENT { $$.into = $2 } | ||
| PREDICT WITH attrs { $$.predAttrs = $3 } | ||
| predict_clause USING IDENT { $$.model = $3 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this line mean we can write multiple USING
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, and found some same syntax bug, such as SELECT xx FROM a FROM b
, SELECT xxx from table TRAIN model COLUMN a,b COLUMN c,d
, I issued a bug: #616 , and will fix it in another pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good to me.
sql/codegen_alps.go
Outdated
modelDir := fmt.Sprintf("oss://arks-model/%s/%s.tar.gz", session.UserId, pr.predictClause.model) | ||
|
||
return &alpsFiller{ | ||
IsTraining: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why predict should set IsTraining: true,
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
func alpsPred(w *PipeWriter, pr *extendedSelect, db *DB, cwd string, session *pb.Session) error { | ||
f, err := newALPSPredictFiller(pr) | ||
fname := "alps_pre.odps" | ||
filepath := filepath.Join(cwd, fname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer clean up this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
…flow into alps_predict_udf
Related issue: #604
An example SQL can be like:
TODO: sql-machine-learning/gomaxcompute#46