Big Data – What is a data scientist?

A data scientist is a professional with training in computer science, math, and statistics who can make discoveries, spot trends, make predictions, and extract knowledge from data. In 2012, Harvard Business Review named data scientist the “sexiest job of the 21st century”. According to a report from Glassdoor (a job search engine), data scientist ranked No. 1 among the 25 Best Jobs in America for 2016.

A data scientist working at Uber (https://www.kaggle.com/jobs/17102/uber-data-scientist-san-francisco) may be involved in:

  • Building a data-driven model for Uber’s dynamic pricing engine;
  • Identifying and predicting city-specific traffic, travel, and demand patterns;
  • Optimizing the process of matching and dispatching drivers to riders based on rider and traffic data;
  • Using data to understand and predict the behavior of riders and drivers.

Skills and Knowledge a Data Scientist Needs

  • Programming fundamentals, data structures, and algorithms
    • Know programming languages that are well suited to statistical analysis, such as Python and R; be familiar with common data structures and algorithms; and have some knowledge of general-purpose programming languages like Java and C.
  • Math & Statistics
    • Knowing statistics and probability theory is essential for a data scientist. Understanding basic statistical concepts such as p-values, distributions, t-tests, and regression is fundamental.
    • Calculus and Linear Algebra
  • Machine Learning
    • Because machine learning plays an important role in the analysis of big data, a data scientist needs to be familiar with a range of machine learning algorithms and the relevant R or Python libraries.
  • Data Management
    • Be familiar with databases and query languages like SQL, distributed data management systems like Hadoop, and data processing techniques like MapReduce.
  • Data Visualization
    • Visualizing data in an understandable way and communicating data analysis results in plain language are important skills of data scientists.
  • Effective Communication
    • Have excellent oral and written communication skills and the ability to communicate with people working in various domains.

More information: http://blog.udacity.com/2014/11/data-science-job-skills.html
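
To make the MapReduce technique named under Data Management a little more concrete, here is a minimal single-machine sketch in Python of the classic word-count example. It is only an illustration of the map, shuffle, and reduce steps; it does not use Hadoop or any distributed framework, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit one (word, 1) pair for each word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run the three steps in sequence on a list of text documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    docs = ["big data big ideas", "data science"]
    print(word_count(docs))
```

In a real distributed system, the map and reduce steps would run in parallel on many machines and the framework would handle the shuffle; the sketch runs them one after another only to show how the data flows.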

Undergraduate Programs related to Data Science

A few schools offer newly created undergraduate degrees in Data Science. In addition, computer science degrees often offer students a specialization in data science. Students in other majors may be able to pick up a data science-related minor. Some schools allow students to customize a data science degree. We list courses that are considered foundational to a good education in data science.