Inspired by #100DaysOfCode, I've decided to challenge myself to become a Data Engineer by studying and building Data/ML pipelines for 10-12 hours every day for the next 60 days. The challenge starts today, 3rd of September, and should finish by 4th of November, 2019. My focus will be on ML/DL pipelines and the Data Engineering tools around them, such as KubeFlow, Apache Airflow, Apache Spark, Apache Kafka, and TensorFlow. I will document my progress on GitHub and post daily logs on LinkedIn.
Today's Progress: Today was not a productive day. I only finished the Week 1 content of the Natural Language Processing with TensorFlow course.
Thoughts: I am excited and looking forward to starting the Insight Data Engineering Fellows Program on September 9th, 2019.
Today's Progress: Today I finished the Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning course on Coursera.
Thoughts: It was not a difficult course. However, it gave me a solid understanding of the TensorFlow 2.0 API and Convolutional Neural Networks (ConvNets). Building some simple image classifiers was fun.
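To make the ConvNet idea concrete, here is a minimal pure-Python sketch of the operation a convolutional layer applies: sliding a small filter over an image and summing the element-wise products. This is an illustration only, not code from the course and not the TensorFlow API. The function name `convolve2d`, the toy image, and the edge filter are all assumptions chosen for the example. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the result.

    Both arguments are lists of lists of numbers. The kernel is not flipped,
    which matches what deep learning libraries call a "convolution".
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1

    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the window under the kernel element-wise and sum it up.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x5 "image": dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

# A simple vertical-edge filter: responds where brightness rises left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 27, 27], [0, 27, 27]]
```

The flat dark region produces 0, while windows that straddle the dark-to-bright boundary produce a strong response. A ConvNet learns filter values like these from data instead of having them written by hand.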
Useful Links:
👉 Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning https://www.coursera.org/learn/introduction-tensorflow
👉 Fashion MNIST with Keras and TPUs https://research.google.com/seedbank/seed/fashion_mnist_with_keras_and_tpus
👉 Understanding Convolutions https://colah.github.io/posts/2014-07-Understanding-Convolutions/
Today's Progress: Today I started the TensorFlow in Practice Specialization from deeplearning.ai. I am in week 4 of the Introduction to TensorFlow for Artificial Intelligence course.
Thoughts: I like the way the AI Advocate (instructor) introduced convolutional neural networks by building a simple classifier using the Fashion MNIST dataset and TensorFlow. The TensorFlow in Practice Specialization is hands-on. Looking forward to learning more about TensorFlow.
Useful Links:
👉 TensorFlow in Practice Specialization https://www.coursera.org/specializations/tensorflow-in-practice
👉 Different Convolution Filters https://lodev.org/cgtutor/filtering.html
👉 Machine Learning Fairness https://developers.google.com/machine-learning/fairness-overview/
👉 Collection of Interactive Machine Learning Examples https://research.google.com/seedbank/
👉 Step-by-step Guide to Install TensorFlow 2 https://medium.com/@cran2367/install-and-setup-tensorflow-2-0-2c4914b9a265
Today's Progress: I wrote a blog post on LinkedIn where I explained Apache Airflow core concepts.
Thoughts: There are so many interesting concepts in Airflow. It is an excellent tool for workflow orchestration. I want to spend more time building custom Operators, Hooks, and data pipelines.
Link to work: Apache Airflow Core Concepts
Here are some useful links:
👉 A Definitive Compilation of Apache Airflow Resources - Aakash Pydi https://towardsdatascience.com/a-definitive-compilation-of-apache-airflow-resources-82bc4980c154
👉 DAG Writing Best Practices in Apache Airflow https://www.astronomer.io/guides/dag-best-practices/
👉 Automate AWS Tasks Thanks to Airflow Hooks - Arnaud https://blog.sicara.com/automate-aws-tasks-boto3-airflow-hooks-593c3120e8fc
👉 Getting started with Apache Airflow - Adnan Siddiqi https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b
👉 Orchestration and DAG Design in Apache Airflow — Two Approaches https://medium.com/hashmapinc/orchestration-and-dag-design-in-apache-airflow-two-approaches-35edd3eaf7c0
👉 Apache Airflow Core Concepts - Zahidul Islam https://www.linkedin.com/pulse/apache-airflow-core-concepts-zahidul-islam/?trackingId=X3YNEn0IQHehblxk9G0Z7Q%3D%3D
Today's Progress: Spent time learning about Apache Airflow. Airflow is a platform to programmatically author, schedule and monitor workflows. Link: https://airflow.apache.org/index.html
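Airflow models a workflow as a DAG (directed acyclic graph) of tasks. As a rough illustration of what "programmatically authoring" a workflow means, here is a toy sketch that uses only the Python standard library (`graphlib`, Python 3.9+). It is not the Airflow API: the `run_pipeline` helper, the task names, and the in-memory `log` are hypothetical, and it does no scheduling or monitoring.

```python
from graphlib import TopologicalSorter


def run_pipeline(dependencies, tasks):
    """Run each task after all of its upstream tasks and return the run order.

    `dependencies` maps a task name to the set of task names it depends on.
    `tasks` maps a task name to a zero-argument callable.
    """
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        order.append(name)
    return order


# Hypothetical three-step ETL workflow that just records what it did.
log = []
tasks = {
    "extract": lambda: log.append("extracted rows"),
    "transform": lambda: log.append("transformed rows"),
    "load": lambda: log.append("loaded rows"),
}

# extract -> transform -> load
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

execution_order = run_pipeline(dependencies, tasks)
print(execution_order)  # ['extract', 'transform', 'load']
```

In Airflow itself, the same dependency structure is declared in a DAG definition file, and the scheduler decides when each task runs and tracks its state.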
Thoughts: Very happy with my progress, and excited to start building a DynamoDB to BigQuery ETL pipeline using Airflow tomorrow.
There are so many excellent blogs on Airflow. Today I want to share some beginner-friendly resources:
👉 Airflow official documentation https://airflow.apache.org/index.html
👉 Apache Airflow for the confused - Jonathan Pichot https://medium.com/nyc-planning-digital/apache-airflow-for-the-confused-b588935669df
👉 Apache Airflow: Tutorial and Beginners Guide https://www.polidea.com/blog/apache-airflow-tutorial-and-beginners-guide/
👉 Apache Airflow on Docker for Complete Beginners https://medium.com/@itunpredictable/apache-airflow-on-docker-for-complete-beginners-cf76cf7b2c9a
👉 Understanding Apache Airflow’s key concepts https://medium.com/@dustinstansbury/understanding-apache-airflows-key-concepts-a96efed52b1a
👉 How to start automating your data pipelines with Airflow - Sriram Baskaran https://blog.insightdatascience.com/airflow-101-start-automating-your-batch-workflows-with-ease-8e7d35387f94