Week 1: 10/9/20 - 10/16/20
In my quest to further improve my overall data science skills, I pulled the trigger on October 9th, 2020, and enrolled in a Data Engineering boot camp led by Andreas Kretz. First, a little bit about myself.
I have a background in Aerospace Engineering and have been in the industry for close to 15 years now. A little more than a year ago, I decided to pivot to Machine Learning and Data Science. The world is changing rapidly and has been for quite some time. I first noticed the exponential growth of tools for gaining insights from data about 6 years ago, though I am sure it had been going on for longer than that. During that time, I spent many hours reading articles discussing data science and machine learning. I was intrigued, especially by computer vision, which automated drones rely on heavily. To me, recognizing whether an object was a bird or another drone was near magic. Of course, the field of machine learning is far more than just computer vision; it touches a myriad of industries where the applications are limitless and thus exciting!
After being in the aerospace/automotive CAE world for more than a decade, I decided to leave and make my foray into Data Science by joining an in-person boot camp at METIS Seattle. It was a difficult, intensive 12-week training camp that dove into the best practices of Data Science and its tools. I learned a tremendous amount and continue to do so today.
After the boot camp, as I was searching for Data Science opportunities in the industry, I noticed that the requirements for a Data Scientist have changed from what they were 2-3 years ago. Employers prefer a full-stack Data Scientist who not only does the machine learning part but can also create an entire data pipeline from start to finish.
While reading up on Data Science, especially on Quora and LinkedIn, I came across articles by Andreas Kretz. This was more than 3 years ago. Fast-forward to today: knowing that the industry now expects a more holistic skill set from a Data Scientist, I am excited to be officially enrolled in Andreas' Data Engineering course on Team Data Science!
Module 1 in the curriculum requires that one sets goals. My goal for this course is to create a complete end-to-end data pipeline for a time-series application. This will involve sourcing, ingestion, machine learning, business intelligence, and monitoring. The project I am leaning toward is anomaly detection. The specific industry will most likely be financial services, though the fundamentals themselves should not change.
Module 2 requires me to look at job postings. My initial search was for Data Engineer positions close to my area, which is Seattle. I did a cursory search online and jotted down the requirements in the industries I was interested in. After I discussed my findings with Andreas, he suggested that because Data Engineer positions require a different skill set from the one I currently have in my arsenal, I should focus instead on Data Science positions. This was great feedback for recalibrating, and I redid the search. This time I searched not only within the Greater Seattle area but also for remote positions, as remote work will be the norm well into mid-2021.
Most of the requirements centered around knowledge of Python, SQL, machine learning libraries such as scikit-learn and NLTK, and cloud technologies such as AWS, Azure, and GCP. Other nice-to-haves are Kafka, PowerBI and/or Tableau, Agile, and the Java programming language.
Going into this course, I feel a sense of challenge, yet also excitement. All of this is happening while I am job searching and interviewing full-time.
For the curious, below is the calendar breakdown for the Data Engineering course. Let's call this the October 2020 cohort.