How do we structure our dbt projects in 2022 and beyond? Discussing the creation of a new Guide. #1284
Replies: 8 comments 13 replies
-
Hmm should we include our use of reverse ETL internally @ dbt Labs (and our use of an
-
Re-commenting since I wasn't done with my thoughts! Things I love about this article:
This is a classic for a reason! But here are some things I was thinking about as I reread this for the 123921389th time:
Actual AE things I've come to like:
-
Agree so much with the above! One of the most frequent and most important concepts is the one-to-one staging layer -- I view this concept as more of a rule than a suggestion. But why do we build it? Why is it so important? Why would it be a rule and not a guideline?! That needs to be explained here, otherwise it seems like extra, unnecessary work just for the sake of structure and organization. Would love to help craft this convincing piece of writing. Other concepts I think are more rule-like:
The other pieces I'd like to see as options. Here are several if-then examples:
I'd love to see mentions of the accessories -- even though I know we'll cover them in separate posts. Something more like: hey, we know your data doesn't end in dbt! It may end up in these places, so look out for some best practice guides there.
-
^ 💯 I've been thinking a lot about the delineation you made between rules and options and I really appreciate it as a framework for this revamp.
-
Agree with all of the above! I have lots of thoughts on all of this, but the things that stick out the most to me are:
diligence in the right areas pays dividends.
-
it lives! https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview
-
I am late to this party, but am finally reading through! It is very good. I have a handful of nitpicks and comments, mostly around SQL style as opposed to naming structure specifically.
One other thought: I think it would be useful when providing an antipattern example to make that clear in the image itself, for ease of skimming.
-
What are we doing?
The How we structure our dbt projects post is one of the most popular and relied-upon works of analytics engineering knowledge created to date. It's been a long time (in data years) since it was published, though, and we have a new, improved system for sharing knowledge on the Developer Hub.
We decided it was time to update this classic as our first Guide on the new platform! While we still have many opinions shaped by our consulting, teaching, and solutions work with companies of all sizes, this time we wanted to make sure we talked in-depth with you all, to fold your voices into our recommendations. While this Guide will still represent dbt Labs Best Practices, it's important to us that these are informed and improved by the Community. Particularly, we want to hear about any important areas you felt the original didn't cover, or areas where you strongly disagreed!
We also have some specific questions we're discussing internally about changes we've made to naming and other principles that we'll aim to share with you all soon, so if you're interested, we'll branch another discussion off of this one within the next couple of weeks.
Some questions to consider
Consider all of the following as potential prompts for thinking about this core set of questions: what aspects of the original How we structure our dbt projects post influenced you the most? What stuck with you? What was missing? What did you invent for yourself? Where did you diverge from dbt Labs' best practices over time? Where have you always disagreed? We'd love to understand what we can do to improve the coverage and structure of the new guide as we update the content and platform it lives on.
Some areas that might spark ideas on the above: How do you manage files and folder structure? How do you split up YAML files within that? What sort of naming conventions do you rely on? How do these intersect with your modeling approach? How do files and folder conventions intersect with tagging, YAML selectors, and selector syntax overall -- in both development and jobs? Do you use snapshots (an area not covered by the original guide) and if so how and where? What macros do you always override or packages do you reach for all the time? What do you do with the analysis folder?
What we're not covering in this guide
war discussion over commas for a different thread. 😄
Thank you
We really appreciate you all taking the time to share your thoughts on the next generation of this guide with us!