In the last article, I shared a framework to help you answer the question, “Should I become a data scientist (or business analyst)?” For those who clear the cut-offs, the next obvious question is “How do I become a data scientist?” In this article, I’ll share what I would do if I were starting my journey toward a career in data science today.
Starting a data science career without proper guidance and planning can be confusing. Expert curators at Analytics Vidhya have compiled a free, clear-cut roadmap to building a career in data science.
I started my career as an analyst without any knowledge of the tools I was going to work with. All I knew was how to create basic models in Excel. I had not heard of pivot tables and didn’t know that something like conditional formatting even existed in Excel!
Thankfully, Capital One hired me for my logical thinking and not for my knowledge of the tools I would need to use. In the following years, by working with several employers, freelancing, and doing a few pet projects, I learned several tools and techniques, including SAS, SPSS, R, and Python!
Having said that, if I were starting my career today, would I choose the same path? The answer is NO. I would take a very different path from the one I took. This path would not only cut out the period of confusion I went through but also take advantage of some of the dramatic shifts that have happened in the analytics industry in the past few years.
So, I thought I would share how I would plan my journey to become a data scientist if I had to chart my career path today. Here are the steps, in chronological order:
Thankfully, this didn’t change much for me. Education makes a huge difference to your prospects of starting in this industry. Most companies that hire fresh graduates pick people directly from the best colleges. So, by getting into a top-tier university, you give yourself a very strong chance of entering the data science world.
Ideally, I would take up Computer Science as my subject of study. If I didn’t get a seat in the Computer Science batch, I would take up a subject that has close ties with the computational field, e.g. computational neuroscience or computational fluid dynamics.
This is probably the biggest change that would happen in my journey if I were graduating now. If you spend even a year studying the subject through these open courses, you will be in far better shape than other people vying to enter the industry. It took me 5+ years of experience to appreciate the power that R and Python bring to the table. Today, you can get there much sooner by taking up various courses.
One word of caution here: be selective about the courses you choose. I would focus on learning one stack, either R or Python. I would recommend Python over R today, but that is a personal choice.
This is to get some real-world experience before you actually venture out. It should also give you an understanding of the work that happens in the real world. You would get a lot of exposure to real-world challenges in data collection and cleaning here.
You should aim for at least one top 10% finish on Kaggle before you leave university. This should bring you to the attention of recruiters quickly and give you a strong launchpad. Beware, this sounds a lot easier than it actually is. It can take multiple competitions for even the smartest people to make it into the top 10% on Kaggle.
Here is an additional tip to amplify the results of your efforts: share your work on GitHub. You never know which employer might find you through your work!
I would take up a job in a start-up that is doing awesome work in analytics or machine learning. The amount of learning you can gain for the slight risk can be amazing. There are start-ups working on deep learning and reinforcement learning, so choose the one that fits you best (taking culture into account).
If you are not the start-up kind, join an analytics consultancy that works on tools and problems across the spectrum. Ask for projects in different domains, work on different algorithms, and try out new approaches. If you can’t find a role in a consultancy, take up a role in a captive unit, but seek a role change every 12 to 18 months. Again, this is a general guideline, so adapt it depending on how much you are learning in the role.
What do you think about this path toward a career in data science? Do you have additional tips that can help people make their career choices? Please feel free to post them below for the benefit of a larger audience.
photo credit: Indy Kethdy via photopin cc
Awesome write-up, Kunal!
Thanks Pradeep
Hi Kunal, I have 7 years of IT experience in development. I have gone through all your posts and found them a great resource of knowledge. I have some queries and confusion in my mind. Could you please help me with them? 1. I went through your last post (“Should I become a data scientist (or business analyst)?”), judged myself, and scored 54. What should I do? 2. I have 7 years of experience in IT development. Is changing career a good idea? 3. Which field is better: Big Data, Data Science, or Business Analyst? 4. I found courses on Jigsaw (Big Data and Data Science). Are they good for starting, or are there better courses in that area? 5. Are you providing any type of training from your end? If yes, please update me, as I want to join. Thanks, Dinesh
Dinesh, here are answers to your queries: 1. Which areas required the most improvement? 2. It depends on how you feel about your current area. I usually advise against making a late career shift unless you are dead sure about it. You can read more details here. 3. Given your background in IT, Big Data might be the best bet, but it depends on your exact experience. 4. Courses from Jigsaw are good. You can also take up a course on Big Data on Udacity, but it is a basic course. 5. We have a basic training running for college students, focusing on Excel. Apart from that, we are not running any other trainings. Regards, Kunal
Great article, Kunal. I'm in the final year of my MS Statistics course. I am comfortable with using R. Should I also do a course on SAS? Will it help? Also, could you tell me what exactly the top analytics companies look for in a candidate when they hire for the role of data scientist? Lastly, have you taken the Hadoop course on Udacity yourself? Is the free version of it good enough?
Kingshuk, I would keep the focus on R. SAS is easier to learn and can be picked up quickly, more so if you are from a stats background. Interviews for analytics roles typically take the form of business case studies and guesstimates, along with tests of technical skills. You can read more about these interviews here. A few companies have also started arranging hackathons to solve their hiring problems out of college. As for the course on Udacity, it is a basic course. I did it myself some time back. If you are more interested in big data, you can also try out bigdatauniversity.com. Regards, Kunal