Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create dim_platform model #1364

Merged
merged 5 commits into from
Nov 8, 2024

Conversation

rachellougee
Copy link
Contributor

@rachellougee rachellougee commented Nov 4, 2024

What are the relevant tickets?

#1353

Description (What does it do?)

Creating dim_platform seeded by a csv file

How can this be tested?

dbt build --select +dim_platform

@rachellougee rachellougee linked an issue Nov 4, 2024 that may be closed by this pull request
@rachellougee rachellougee marked this pull request as ready for review November 5, 2024 16:11
2,edX.org,MIT Online courses on edX.org,www.edx.org
3,MIT xPRO,MIT Professional development courses and programs that build skills and augment careers,xpro.mit.edu
4,MITx Residential,MITx Residential,lms.mitx.mit.edu
5,MIT Bootcamps,MIT Bootcamps, bootcamp.odl.mit.edu
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdpinch Do you have any suggestions on the platform_description?

I grabbed some text from https://openlearning.mit.edu/, but it needs some editing
image

@KatelynGit
Copy link
Contributor

Should we include OCW?

Copy link
Contributor

@KatelynGit KatelynGit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. One thing is should the platform id be a numeric data type? Usually numeric types are better for id's for size and speed reasons. And does the platform info in the csv match the info coming form the applications?

@rachellougee
Copy link
Contributor Author

rachellougee commented Nov 7, 2024

@KatelynGit Thanks for the review.

Should we include OCW?

Sure. I added OCW although we don't include OCW into the combined mart. It's good to have that in place.

One thing is should the platform id be a numeric data type? Usually numeric types are better for id's for size and speed reasons.

I believe we have discussed this in RFC https://github.com/mitodl/hq/pull/5749#discussion_r1805182430
and the adopted decision https://github.com/mitodl/hq/blob/main/rfcs/0004-data-warehouse-dimensional-marts.md#schema-considerations. The hash key makes sense to me as we need a non-numeric primary key for tables like combined users. I thought we were on the same page, but let me know if we need to revisit this discussion.

And does the platform info in the csv match the info coming form the applications?

Can you say more about this? Not sure if I understand what you asking.

@KatelynGit
Copy link
Contributor

KatelynGit commented Nov 7, 2024

@KatelynGit Thanks for the review.

Should we include OCW?

Sure. I added OCW although we don't include OCW into the combined mart. It's good to have that in place.

Thanks for adding OCW.

One thing is should the platform id be a numeric data type? Usually numeric types are better for id's for size and speed reasons.

I believe we have discussed this in RFC mitodl/hq#5749 (comment) and the adopted decision https://github.com/mitodl/hq/blob/main/rfcs/0004-data-warehouse-dimensional-marts.md#schema-considerations. The hash key makes sense to me as we need a non-numeric primary key for tables like combined users. I thought we were on the same page, but let me know if we need to revisit this discussion.

ok yes a hash key makes sense

And does the platform info in the csv match the info coming form the applications?

Can you say more about this? Not sure if I understand what you asking.

I just meant for things like the platform names this new table lists the platform as "MIT xPRO" but in the marts it is displayed differently like "xPro" in marts__combined__orders etc. Do you think we should change it to match what we have in the marts or applications currently?

@rachellougee
Copy link
Contributor Author

rachellougee commented Nov 7, 2024

I just meant for things like the platform names this new table lists the platform as "MIT xPRO" but in the marts it is displayed differently like "xPro" in marts__combined__orders etc. Do you think we should change it to match what we have in the marts or applications currently?

Thank you for clarifying that. For clarity, I intended to spell out MIT xPRO instead of xPRO in the new model to match with https://openlearning.mit.edu/. Currently, xPRO in the mart is pulled from a variable defined in https://github.com/mitodl/ol-data-platform/blob/main/src/ol_dbt/dbt_project.yml#L36.

Unless we hardcoded the platform name somewhere on the superset dataset, this new value should flow to the mart and dashboard without any side effects. I can match it to what we currently have if needed. I believe these platform names are just for display and filtering.

@KatelynGit
Copy link
Contributor

I just meant for things like the platform names this new table lists the platform as "MIT xPRO" but in the marts it is displayed differently like "xPro" in marts__combined__orders etc. Do you think we should change it to match what we have in the marts or applications currently?

Thank you for clarifying that. For clarity, I intended to spell out MIT xPRO instead of xPRO in the new model to match with https://openlearning.mit.edu/. Currently, xPRO in the mart is pulled from a variable defined in https://github.com/mitodl/ol-data-platform/blob/main/src/ol_dbt/dbt_project.yml#L36.

Unless we hardcoded the platform name somewhere on the superset dataset, this new value should flow to the mart and dashboard without any side effects. I can match it to what we currently have if needed. I believe these platform names are just for display and filtering.

ok that sounds good! I'll go in and approve

@rachellougee rachellougee merged commit 00bd178 into main Nov 8, 2024
4 checks passed
@rachellougee rachellougee deleted the 1353-create-dim_platform-dimensional-table branch November 8, 2024 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create dim_platform dimensional model
2 participants