Master’s of Data Science


Take Your Career to the Next Level

Data science is expected to remain among the fastest-growing fields in the world, with expanding applications across academia, government, healthcare, and nonprofits.3 The unique MSDS curriculum integrates approaches and techniques from statistics and computer science, giving MSDS graduates a wide range of tools to tackle any data challenge.

Gain Foundational Knowledge for Wide Industry Applications

Graduate from a Top-Ranked Public University4


Affordable, Advanced Degree Priced at $10,000+ Fees5


The online DS master's degree program is offered jointly by UT Austin’s Department of Statistics and Data Sciences and Department of Computer Science. The MSDS program integrates advanced approaches, techniques, and skills from statistics and computer science. The curriculum allows you to develop a uniquely broad spectrum of tools and approaches for understanding, analyzing, and modeling data.

With tenured teaching faculty from both sponsoring departments, the MSDS program will teach you advanced approaches, techniques, and skills across the fields of statistics and computer science. Courses cover probability and simulation, regression analysis, and data visualization, as well as computer science topics such as machine learning, data structures, and optimization. MSDS students graduate with a strong foundation in data analysis along with applied training in machine learning and other computational approaches to data.

3Bureau of Labor Statistics, Computer and Information Research Scientists, Job Outlook 2021-2031.    
4Top Public Schools, U.S. News & World Report, Ranked 2022-2023.    
5International student fees and late registration fees may apply.


Required courses: three foundational courses

Elective courses: seven additional courses

Ten Courses

The online master’s degree in data science is a 30-hour program consisting of nine hours of Foundational courses and 21 hours of additional required courses and elective courses. Each course counts for three credit hours and you must take a total of 10 courses to graduate. While it is not required, it is recommended that MSDS students first complete the three required Foundational courses listed below before completing additional required courses and elective courses.

Foundational Courses

In this course, students will develop their programming skills while learning the fundamentals of data structures and algorithms. Students will hone their programming skills by completing non-trivial programming assignments in Python, and they will be introduced to important programming methodologies and skills, including testing and debugging. Students will learn a variety of data structures, from the basics, such as stacks, queues, and hash tables, to more sophisticated data structures such as balanced trees and graphs. In terms of algorithms, the focus will be on the practical use and analysis of algorithms rather than on proof techniques.


  • Programming Skills: Testing
  • Programming Skills: Debugging
  • Programming Skills: Programming Methodology
  • Data Structures: Stacks, Queues
  • Data Structures: Linked lists
  • Data Structures: Hash Tables
  • Data Structures: Trees
  • Data Structures: Balanced Trees
  • Data Structures: Binary Heaps
  • Data Structures: Graphs
  • Algorithms: Algorithm Analysis
  • Algorithms: Searching and Sorting
  • Algorithms: Divide and Conquer Algorithms
  • Algorithms: Greedy Algorithms
  • Algorithms: Dynamic Programming

Calvin Lin

Professor, Computer Science

Probability and Simulation Based Inference for Data Science is a statistics-based course necessary for developing core skills in data science and for basic understanding of regression-based modeling. Students can look forward to gaining a foundational knowledge of inference through the simulation process.

What You Will Learn

  • Definition of probabilities and probability calculus
  • Random variables, probability functions and densities
  • Useful inequalities
  • Sampling distributions of statistics and confidence intervals for parameters
  • Hypothesis testing
  • Introduction to estimation theory (Properties of estimators, maximum likelihood estimation, exponential families)


  • Events and probability (1 week)
  • Random variables (1 week)
  • Moments and inequalities (1 week)
  • Continuous random variables (1 week)
  • Normal distribution and the central limit theorem (1 week)
  • Sampling distributions of statistics and confidence intervals (1.5 weeks)
  • Hypothesis testing (2 weeks)
  • Introduction to Estimation Theory (1.5 weeks)

Mary Parker

Associate Professor, Statistics and Data Sciences

Foundations of Regression and Predictive Modeling is designed to introduce students to the fundamental concepts behind the relationships between variables, more commonly known as regression modeling. Learners will be exposed not only to the theoretical background of regression models but also to extensive demonstrations of every model. The emphasis throughout will be on hypothesis testing, model selection, goodness of fit, and prediction.

What You Will Learn

  • Learn the key ideas behind regression models.
  • Apply the ideas and analysis to various types of regression models.
  • Understand and interpret the output from an analysis.
  • Understand key procedures such as hypothesis testing, prediction, and Bayesian methods.


  • Foundations and Ideas, Simple Linear Model, Correlation; Estimation; Testing.
  • Multiple Linear Regression, Vector and matrix notation; Collinearity; Ridge regression.
  • Bayes Linear Model; Conjugate model; Prior to posterior analysis; Bayes factor.
  • Variable Selection, LASSO, Principal component analysis; Bayesian methods.
  • ANOVA Models, One-way ANOVA; Two-way ANOVA; ANOVA Table; F-tests.
  • Moderation & Interaction, Testing for interaction; Sobel test.
  • Nonlinear Regression, Iterative estimation algorithms; Bootstrap.
  • Poisson regression, Analysis of count data, Weighted linear model.
  • Generalized Linear Model, Exponential family; GLM theory; Logistic regression.
  • Nonparametric Regression, Kernel smoothing; Splines; Regression trees.
  • Mixed Effects Model, Fixed and random effects; EM algorithm; Gibbs sampler.
  • Multiclass Regression, Classification tree; Multinomial logistic regression.

Stephen Walker

Professor, Mathematics & Statistics and Data Sciences

Additional Required Courses

This class covers advanced topics in deep learning, ranging from optimization to computer vision, computer graphics and unsupervised feature learning, and touches on deep language models, as well as deep learning for games.

Part 1 covers the basic building blocks and intuitions behind designing, training, tuning, and monitoring deep networks. The class covers both the theory of deep learning and hands-on implementation sessions in PyTorch. In the homework assignments, we will develop a vision system for a racing simulator, SuperTuxKart, from scratch.

Part 2 covers a series of application areas of deep networks: computer vision, sequence modeling in natural language processing, deep reinforcement learning, generative modeling, and adversarial learning. In the homework assignments, we develop a vision system and racing agent for a racing simulator, SuperTuxKart, from scratch.

What You Will Learn

  • About the inner workings of deep networks and computer vision models
  • How to design, train, and debug deep networks in PyTorch
  • How to design and understand sequence models
  • How to use deep networks to control a simple sensory motor agent


  • Background
  • First Example
  • Deep Networks
  • Convolutional Networks
  • Making it Work
  • Computer Vision
  • Sequence Modeling
  • Reinforcement Learning
  • Special Topics
  • Summary

Philipp Krähenbühl

Assistant Professor, Computer Science

This course focuses on core algorithmic and statistical concepts in machine learning.

Tools from machine learning are now ubiquitous in the sciences with applications in engineering, computer vision, and biology, among others. This class introduces the fundamental mathematical models, algorithms, and statistical tools needed to perform core tasks in machine learning. Applications of these ideas are illustrated using programming examples on various data sets.

Topics include pattern recognition, PAC learning, overfitting, decision trees, classification, linear regression, logistic regression, gradient descent, feature projection, dimensionality reduction, maximum likelihood, Bayesian methods, and neural networks.

What You Will Learn

  • Techniques for supervised learning including classification and regression
  • Algorithms for unsupervised learning including feature extraction
  • Statistical methods for interpreting models generated by learning algorithms


  • Mistake Bounded Learning (1 week)
  • Decision Trees; PAC Learning (1 week)
  • Cross Validation; VC Dimension; Perceptron (1 week)
  • Linear Regression; Gradient Descent (1 week)
  • Boosting (0.5 week)
  • PCA; SVD (1.5 weeks)
  • Maximum likelihood estimation (1 week)
  • Bayesian inference (1 week)
  • K-means and EM (1-1.5 weeks)
  • Multivariate models and graphical models (1-1.5 weeks)
  • Neural networks; generative adversarial networks (GAN) (1-1.5 weeks)

Adam Klivans

Professor, Computer Science

Qiang Liu

Assistant Professor, Computer Science

Elective Courses 1 (complete 3 of 4)

This course is designed to extend foundational knowledge in probability, statistical inference, and regression analysis to settings involving complex data structures. The first part of the course focuses on statistical analysis of time series and spatial data. Topics related to time-series analysis include autocorrelation, classical time-series models, state-space models, and hidden Markov models. Topics related to spatial statistics include the analysis of spatial point patterns, Gaussian processes with spatial correlation functions and prediction/kriging, and spatial autoregressive models. A primary goal is for learners to recognize settings when standard statistical methods based on assumptions of independence are inappropriate. In addition, learners will gain awareness of and experience in applying modern statistical methods for the analysis of dependent data. Applications in areas ranging from finance to ecology to public health will be emphasized.

The second part of the course focuses on inference for matrix structured data, including methods for matrix completion and denoising, clustering, network models and inference, random walks on graphs, and graph representation learning. For matrix completion, topics include the singular value decomposition, non-negative matrix factorization, and iterative optimization methods. For network models, topics include the stochastic blockmodel and its degree corrected and mixed membership variants, latent distance models, etc. Learners will be introduced to spectral methods, Bayesian methods, convex relaxation, and likelihood-based methods for inference. Learners will also gain awareness of how different sparsity regimes call for different types of regularization in spectral approaches. Random walks on graphs will be introduced as a tool for graph representation learning.

What You Will Learn

  • How to identify different types of dependencies in structured data
  • How to apply appropriate models for dependent data
  • How to make forecasts/predictions that account for dependence and structure
  • How to identify latent structure in complex data


  • Classical Time-series Models
  • State-Space Models
  • Hidden Markov Models
  • Methods for Detecting Spatial Clustering
  • Gaussian Process Models for Spatial Prediction
  • Spatial Autoregressive Models
  • PCA and SVD
  • Methods for Matrix Completion and Denoising
  • Clustering

Catherine Calder

Professor, Chair, Statistics and Data Sciences

Purnamrita Sarkar

Assistant Professor, Statistics and Data Sciences

In Data Exploration, Visualization, and Foundations of Unsupervised Learning, students will learn how to visualize data sets and how to reason about and communicate with data visualizations. Students will also learn how to assess data quality and provenance, how to compile analyses and visualizations into reports, and how to make the reports reproducible. A substantial component of this class will be dedicated to learning how to program in R.

What You Will Learn

  • Data visualization
  • R programming
  • Reproducibility
  • Data quality and relevance
  • Data ethics and provenance
  • Dimension reduction
  • Clustering


  • Introduction, reproducible workflows
  • Aesthetic mappings
  • Telling a story
  • Visualizing amounts
  • Coordinate systems and axes
  • Visualizing distributions I
  • Visualizing distributions II
  • Color scales
  • Data wrangling 1
  • Data wrangling 2
  • Visualizing proportions
  • Getting to know your data 1: Data provenance
  • Getting to know your data 2: Data quality and relevance
  • Getting things into the right order
  • Figure design
  • Color spaces, color vision deficiency
  • Functions and functional programming
  • Visualizing trends
  • Working with models
  • Visualizing uncertainty
  • Dimension reduction 1
  • Dimension reduction 2
  • Clustering 1
  • Clustering 2
  • Data ethics
  • Visualizing geospatial data
  • Redundant coding, text annotations
  • Interactive plots
  • Over-plotting
  • Compound figures

Claus O. Wilke

Professor and Chair, Integrative Biology

This course will introduce students to statistical methods used in health data science. Topics will include survival analysis, prediction, longitudinal data analysis, design and analysis of observational studies including propensity score analysis, and design and analysis of randomized studies including sample size and power calculations, intent-to-treat analysis, and noncompliance.

What You Will Learn

  • How to identify the appropriate statistical method to be used to test or explore a hypothesis
  • How to explain and assess the appropriateness of assumptions needed in order to apply a particular method
  • How to analyze survival data, longitudinal data, and observational and clinical trial data and interpret results
  • How to critically identify potential sources of bias, e.g., unmeasured confounding, unknown dependence between subjects, and dependent censoring


  • Introduction to Survival Analysis (1 week)
  • Nonparametric and Parametric estimation in survival (2 weeks)
  • Cox regression, model diagnostics (1.5 weeks)
  • Prediction and cross-validation (2.5 weeks)
  • Longitudinal data visualization (1 week)
  • Longitudinal data analysis (2 weeks)
  • Observational studies, causal inference, propensity scores (2.5 weeks)
  • Randomized trials, power, noncompliance (2.5 weeks)

Layla Parast

Associate Professor, Statistics and Data Sciences

While much of statistics and data sciences is framed around problems of prediction, Design Principles and Causal Inference for Data-Based Decision Making will cover basic concepts of statistical methods for inferring causal relationships from data, with a perspective rooted in a potential-outcomes framework.  Issues such as randomized trials, observational studies, confounding, selection bias, and internal/external validity will be covered in the context of standard and non-typical data structures. The overall goal of the course is to train learners on how to formally frame questions of causal inference, give an overview of basic methodological tools to answer such questions and, importantly, provide a framework for interrogating the causal validity of relationships learned from data.  The target audience for this course is someone with basic statistical skills who seeks training on how to use data to characterize the consequences of well-defined actions or decisions.

What You Will Learn

  • How to formalize causality with observed data
  • Common threats to causal validity
  • Non-typical data structures
  • Novel design strategies
  • Causal inference


  • What is “causal inference”?
  • Potential outcomes
  • Regression
  • Standardization
  • Matching designs
  • Quasi-experimental designs

Corwin Zigler

Associate Professor, Statistics and Data Sciences

Elective Courses 2 (complete 2 of 3)

This course focuses on modern natural language processing using statistical methods and deep learning. Problems addressed include syntactic and semantic analysis of text as well as applications such as sentiment analysis, question answering, and machine translation. Machine learning concepts covered include binary and multiclass classification, sequence tagging, feedforward, recurrent, and self-attentive neural networks, and pre-training / transfer learning.

What You Will Learn

  • Linguistics fundamentals: syntax, lexical and distributional semantics, compositional semantics
  • Machine learning models for NLP: classifiers, sequence taggers, deep learning models
  • Knowledge of how to apply ML techniques to real NLP tasks


  • ML fundamentals, linear classification, sentiment analysis (1.5 weeks)
  • Neural classification and word embeddings (1 week)
  • RNNs, language modeling, and pre-training basics (1 week)
  • Tagging with sequence models: Hidden Markov Models and Conditional Random Fields (1 week)
  • Syntactic parsing: constituency and dependency parsing, models, and inference (1.5 weeks)
  • Language modeling revisited (1 week)
  • Question answering and semantics (1.5 weeks)
  • Machine translation (1.5 weeks)
  • BERT and modern pre-training (1 week)
  • Applications: summarization, dialogue, etc. (1-1.5 weeks)

Greg Durrett

Assistant Professor, Computer Science

This class covers linear programming and convex optimization. These are fundamental conceptual and algorithmic building blocks for applications across science and engineering. Indeed, any time a problem can be cast as one of maximizing or minimizing an objective subject to constraints, the next step is to use a method from linear or convex optimization. Covered topics include formulation and geometry of LPs; duality and min-max; primal and dual algorithms for solving LPs; second-order cone programming (SOCP) and semidefinite programming (SDP); unconstrained convex optimization and its algorithms (gradient descent and Newton's method); constrained convex optimization; duality; variants of gradient descent (stochastic, subgradient, etc.) and their rates of convergence; and momentum methods.


  • Convex sets, convex functions, Convex Programs (1 week)
  • Linear Programs (LPs), Geometry of LPs, Duality in LPs (1 week)
  • Weak duality, Strong duality, Complementary slackness (1 week)
  • LP duality: Robust Linear Programming, Two-person zero-sum games, Max-flow min-cut (1 week)
  • Semidefinite programming, Duality in convex programs, Strong duality (1 week)
  • Duality and Sensitivity, KKT Conditions, Convex Duality Examples: Maximum Entropy (1 week)
  • Convex Duality: SVMs and the Kernel Trick, Convex conjugates, Gradient descent (1 week)
  • Line search, Gradient Descent: Convergence rate and step size, Gradient descent and strong convexity (1 week)
  • Frank-Wolfe method, Coordinate descent, Subgradients (1 week)
  • Subgradient descent, Proximal gradient descent, Newton method (1 week)
  • Newton method convergence, Quasi-Newton methods, Barrier method (1 week)
  • Accelerated Gradient descent, Stochastic gradient descent (SGD), Mini-batch SGD, Variance reduction in SGD (1 week)

Sujay Sanghavi

Associate Professor, Electrical and Computer Engineering

Constantine Caramanis

Professor, Electrical & Computer Engineering

This course introduces the theory and practice of modern reinforcement learning. Reinforcement learning problems involve learning what to do (how to map situations to actions) so as to maximize a numerical reward signal. The course will cover model-free and model-based reinforcement learning methods, especially those based on temporal difference learning and policy gradient algorithms. It covers the essentials of reinforcement learning (RL) theory and how to apply it to real-world sequential decision problems. Reinforcement learning is an essential part of fields ranging from modern robotics to game-playing (e.g., poker, Go, and StarCraft). The material covered in this class will provide an understanding of the core fundamentals of reinforcement learning, preparing students to apply it to problems of their choosing, as well as allowing them to understand modern RL research. Professors Peter Stone and Scott Niekum are active reinforcement learning researchers and bring their expertise and excitement for RL to the class.

What You Will Learn

  • Fundamental reinforcement learning theory and how to apply it to real-world problems
  • Techniques for evaluating policies and learning optimal policies in sequential decision problems
  • The differences and tradeoffs between value function, policy search, and actor-critic methods in reinforcement learning
  • When and how to apply model-based vs. model-free learning methods
  • Approaches for balancing exploration and exploitation during learning
  • How to learn from both on-policy and off-policy data


  • Multi-Armed Bandits
  • Finite Markov Decision Processes
  • Dynamic Programming
  • Monte Carlo Methods
  • Temporal-Difference Learning
  • n-step Bootstrapping
  • Planning and Learning
  • On-Policy Prediction with Approximation
  • On-Policy Control with Approximation
  • Off-Policy Methods with Approximation
  • Eligibility Traces
  • Policy Gradient Methods

Peter Stone

Professor, Computer Science

Scott Niekum

Associate Professor, Computer Science

Important Dates

Fall Application

Spring Application

Please note: Applying to UT Austin is a twofold process. We recommend that applicants apply to UT Austin before the priority deadline. This is to ensure their materials are processed in a timely manner.