
Switching from a Data Science Hobbyist to a Professional?


As data hobbyists, we might build many models with different data sets and alter them to analyze the outputs. But in the process, we forget to ask ourselves, “How does this bring value to the company?”

Working as a data hobbyist limits your exposure to clients and real projects. You can certainly gather plenty of data science knowledge from books and the internet. However, you miss out on the impact your work has on a business. For instance, you could have a model with 99% accuracy, but it is of no use if it doesn’t provide any value to the client.

Here are a few things that can only be learned by working on projects:

1. Knowing your client

Instead of jumping directly into analytics and geeking out on the technical stuff, first get to know the client’s domain and what interests the business most. It will be easier to explain things to the owner once you know their background and business.

2. Ask what kind of analytics is profitable

It will save you a lot of time and effort when you know where to dig for gold nuggets. By understanding the requirements up front, you’ll be able to deliver what is expected of you. Of course, analysis varies from person to person: have two different people analyze the same data, and you never know what will come up.

3. Ask what they already know

Since you will be working on a data analytics project, chances are the business owners already know a lot about the sector. So don’t hesitate to ask about the known trends in the field. You can then validate those trends against the data and perhaps explore further. This will most likely help you chart a path for the analysis.

4. Know the source of data

In one case, we found apparent anomalies in an attribute. Only later did we discover that the whole column had been derived from the column we were supposed to predict. So get to know your data well, including which sources are trustworthy and which are not.
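A quick check can sometimes catch a column like that before it reaches a model. The sketch below is one possible approach, not the method used in the case above: it flags numeric features that are almost perfectly correlated with the target. The function names, the 0.98 threshold, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def suspicious_columns(columns, target, threshold=0.98):
    """Return names of features whose |correlation| with the target meets the threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Toy data, invented for illustration only.
target = [10.0, 20.0, 30.0, 40.0, 50.0]
columns = {
    "tenure_months": [3.0, 14.0, 7.0, 30.0, 12.0],
    # Hypothetical leaked column: computed as 2 * target + 1.
    "adjusted_score": [21.0, 41.0, 61.0, 81.0, 101.0],
}

flagged = suspicious_columns(columns, target)
print(flagged)
```

A flagged column is not proof of leakage, only a prompt to ask the data owner how that column was produced. A linear correlation check also misses non-linear derivations, so it complements, rather than replaces, knowing the source of the data.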

5. Focus on interpretable models

If you are predicting customer churn, an accurate prediction alone is of little use. What matters is understanding why customers churn so that you can prevent it. For example, in some countries it is illegal to reject a loan application without providing a satisfactory reason, and “because my model said so” is not satisfactory.

6. Present your results from your client’s perspective

Presentation makes up a big portion of data science skills. Presenting your technical knowledge in a way that builds a bridge between the numbers and the company’s goals will make your life a lot easier. Bonus: it will also get your team’s work appreciated.

Data Science Work Cycle

Now, let’s look briefly at the data science work cycle, along with a few tips if you’re starting your career as a data scientist.

Figure: Data science process (Source: Becoming Human: AI Magazine)

Every data science project follows this rough cycle of collecting data, training a model, and publishing it. A single pass through the cycle completes quickly, and I can’t stress enough how iterative the process is.

Don’t worry if your first model performs badly, because it is supposed to. It establishes a baseline metric for the job, and we then iterate on it again and again. This includes collecting feedback from users and the product owner, and making adjustments as needed, such as addressing class imbalance or adding a new input feature that the company has just started collecting.
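To make the idea of a baseline concrete, here is a minimal sketch that compares a first model against the simplest possible reference: always predicting the most frequent class. The labels, the predictions, and the function names are invented for illustration. With imbalanced classes, this kind of comparison shows quickly whether a model has learned anything beyond the class distribution.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, imbalanced labels (assumption): 1 = churned, 0 = stayed.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Predictions from a hypothetical first model.
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

baseline = majority_baseline_accuracy(y_true)
model_acc = accuracy(y_true, y_pred)
print(f"baseline={baseline:.2f} model={model_acc:.2f}")
```

In this toy case the first model only matches the majority-class baseline, which is exactly the kind of finding that drives the next iteration.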

Every problem needs its own custom solution, so don’t be afraid to come up with new ideas. And always try to keep your model simple. In the academic world, we can find complex solutions that require twice the computation and data of the previous method to increase accuracy by 2%. You have to stop and ask yourself, “Is this the right trade-off for real-world applications?”

Evaluation

Every business requires different methods of evaluation. Some require high recall at the expense of precision, like skin cancer detection, where missing a real case is the costly error. Others require high precision at the expense of recall, like a security system that opens the door with your face, where letting the wrong person in is the costly error.
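The trade-off between the two metrics can be seen on a small example. The sketch below computes precision and recall from scratch for two hypothetical classifiers on the same invented labels: a strict one that flags few cases and a lenient one that flags many. All names and data here are assumptions for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (assumption): 1 = positive case, 0 = negative case.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
strict = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # flags only when very sure
lenient = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # flags anything suspicious

for name, preds in [("strict", strict), ("lenient", lenient)]:
    p, r = precision_recall(y_true, preds)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

The strict classifier never raises a false alarm but misses half of the positives, while the lenient one catches every positive at the cost of some false alarms. Which of the two is preferable depends on which mistake costs the business more.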

Switching from a data hobbyist to a data scientist is going to be challenging but definitely worth the effort. Have fun!

Bipin K.C.

Bipin KC is a Data Scientist/Machine Learning Engineer at Leapfrog. He has a keen interest in computer vision, interpreting models, and delivering elegant analytic solutions.
