Senior data scientists are difficult to reach because of the demands on their time. Yet they are the people with the most useful insights about data science, and they are generally better equipped than other data scientists to offer actionable advice. In a way, they are the most mature professionals in the field and occupy the role that most data scientists aspire to (including the author of this book).
In this chapter, we will look at a research-oriented data scientist from the Greater Atlanta area, Dr. Nikolaos Vasiloglou. We will examine his background, his views on data science in practice, how he sees the field evolving and what tips he has for new and aspiring data scientists. Finally, we’ll end with a summary of the main points from this case study.
17.1 Basic Professional Information and Background
Dr. Vasiloglou is a machine learning specialist, i.e., a data scientist who specializes in the machine learning aspect of the field. He works in the software development and mobile advertising industries. Although he has been working as a data scientist for about five years, he has been involved in the field much longer. His PhD was in scalable machine learning techniques, a topic that integrates seamlessly with data science.
Dr. Vasiloglou has been involved in several local groups related to the field, mainly through meetup.com. He founded Machine Learning by Example, a group for students of machine learning (no longer active), and he is a member of Data Science Atlanta (the largest data science group in the state) as well as of groups for Hadoop and programming languages. He also organizes the MLconf conferences, which are industry-oriented conferences on machine learning.
He believes that two things on his resume played an important role in jumpstarting his career in data science: internships at well-known companies such as Google and a PhD in machine learning from a good university (Georgia Tech). For those who cannot list either of these credentials on their resume, he recommends getting the machine learning certificate from Stanford University (Prof. Ng’s physical class, not the MOOC on Coursera).
Dr. Vasiloglou is part of a four-member team at one of the companies for which he works, and he works on his own at the other. In the team, he is responsible for all of its members and manages them by creating the architectural framework in which they work and by planning the projects in which they are involved.
Dr. Vasiloglou is a very professional individual who at the same time is very down-to-earth and approachable. He can be a fine role model for those who plan to make data science their life-long career.
17.2 Views on Data Science in Practice
Dr. Vasiloglou’s views on data science are based on his experience in the field and his research interests, which revolve around scalable machine learning techniques. His everyday work includes monitoring daily reports (for jobs left to run overnight), brainstorming and mini group meetings, debugging problematic code, reading newsletters and conference proceedings, and revising current problems (e.g., deep learning networks) to keep abreast of new technologies in the field.
According to Dr. Vasiloglou, a senior data scientist differs from the other grades of data scientist in two ways. First, a senior data scientist has more knowledge, know-how and experience, which translates into more efficient work and a wider variety of techniques to employ when tackling a given problem. Second, a senior data scientist is capable of architecting a solution that involves considerable work and may be divided among several people, and of starting a new project (e.g., based on a conversation with a client and the data that he is given).
Examples of data products that he has developed (or participated in the development of) over the years include:
Although he has been practicing in the industry for the past few years, he values the role of researchers in the field and believes that a data scientist ought to be a bridge between academia and industry. Judging by what he says about his life as a data scientist, he seems to have accomplished this very effectively. Since information theory is universal, he believes that he could transition to another industry relatively easily. He finds the sectors of drug discovery and forensics particularly interesting for a data scientist today.
17.3 Data Science in the Future
Dr. Vasiloglou acknowledges the possibility of data science becoming more automated, even completely automated. Still, he sees a lot of merit in having state-of-the-art know-how, as the field is constantly evolving and will no doubt continue to do so. He also expects more programming languages, particularly functional ones (e.g., Scala), to become very popular in data science in the years to come.
17.4 Advice to New Data Scientists
Dr. Vasiloglou believes in the importance of well-founded (solid) knowledge, so he advises newcomers, especially younger people who are still in college or university, to study mathematics (through books, papers, courses, etc.). He also finds merit in competitions (e.g., those on Kaggle), which he recommends for people preparing to enter the field. Such competitions offer lots of useful experience with various types of datasets and give you a chance to put into practice a variety of the data analysis techniques you have learned. He also suggests that newcomers learn software development through object-oriented and functional programming languages. He doesn’t favor any particular language because programming skills are highly transferable.
Dr. Vasiloglou is a champion of equilibrium when it comes to developing your data science skills. Therefore, all of the above recommendations need to be taken into account and followed in an organic and holistic way so that you end up with a balanced skill set.
17.5 Key Points