    Latest

    Data Exploration with Pandas Profiler and D-Tale

    We all have heard how data is the new oil. I always say that if that is the case, we need to go through some refinement process before that raw oil is converted into useful...

    Reproducible Research in Computational Sciences

    This guest post was written by Arnu Pretorius, a Masters student in Mathematical Statistics at the MIH Media Lab, Stellenbosch University. Arnu's...

    Model-Based Machine Learning and Probabilistic Programming in RStan

    In this recorded webcast, Daniel Emaasit introduces model-based machine learning and related concepts, practices and tools such as Bayes' Theorem,...

    An Introduction to Model-Based Machine Learning

    This guest post was written by Daniel Emaasit, a Ph.D. Student of Transportation Engineering at the University of Nevada, Las Vegas. Daniel's...

    Providing Digital Provenance: from Modeling through Production

    At last week's useR! R User conference, I spoke on digital provenance, the importance of reproducible research, and how Domino has solved many of the...

    Announcing Enhanced Apache Spark Support

    Domino now offers data scientists a simple, yet incredibly powerful way to conduct quantitative work using Apache Spark. Apache Spark has captured...

    Orchestrating Pipelines with Luigi and Domino

    Building a data pipeline may sound like a daunting task. In this post, we will examine how you can use Luigi - a library specifically designed to...

    Ugly Little Bits of the Data Science Process

    This morning there was a great conversation on Twitter, kicked off by Hadley Wickham, about one of the ugly little bits of the data science process.

    Moving Academic Researchers into a Corporate Data Science World

    This webinar was recorded on May 25, 2016. There are big cultural differences between data science practices in academic and corporate settings. For...

    Better Knowledge Management for Data Science Teams

    We’re excited to announce a set of big new features that make it easier for you to find and reuse past data science work in your team and...

    “Unit testing” for data science

    An interesting topic we often hear data science organizations talk about is “unit testing.” It’s a longstanding best practice for building software,...

    Making Data Science Fast: Survey of GPU Accelerated Tools

    This talk took place at the Domino Data Science Pop-up in Austin, TX on April 13, 2016. In this talk, Mazhar Memon, CEO and Co-founder at...

    The R Data I/O Shootout

    We pit newcomer R data I/O package, feather, against popular packages data.table, readr, and the venerable saveRDS/readRDS functions from base R....

    How Machine Learning Amplifies Inequality in Society

    This talk took place at the Domino Data Science Pop-up in Austin, TX on April 13, 2016. In this talk, Mike Williams, Research Engineer at Fast...

    Visualizing Machine Learning with Plotly and Domino

    This post was contributed by Chelsea Douglas, a Software Engineer at Plotly. I recently had the chance to team up with Domino Data Lab to produce a...

    Uber and the Need for a Data Science Platform

    For those wondering if data science platforms are really a thing, there’s a great article by Kevin Novak, the head of Uber’s Data Science Platform...

    The Real Value of Containers for Data Science

    Every year, $50 of your taxes is invested in research that can't be reproduced. Erik Andrejko, VP Science, The Climate Corporation, speaking at...
