Data Science Blog | Practical Techniques (7)

Latest

Data Exploration with Pandas Profiler and D-Tale

We have all heard that data is the new oil. I always say that if that is the case, we need to go through some refinement process before that raw oil is converted into useful...

How Atlassian Uses Data Science to Improve Collaboration

Christy Bergman, Data Scientist at Atlassian, gives an inside look at how they use data science to improve user onboarding and collaboration. This...

Data Science Practical Techniques Domino Data Popup data

Fitting Gaussian Process Models in Python

Written by Chris Fonnesbeck, Assistant Professor of Biostatistics, Vanderbilt University Medical Center. A common applied statistics task involves...

Data Science Code Model Management Machine Learning Practical Techniques Python Gaussian Process Models Model Development

Achieving Reproducibility with Conda and Domino Environments

Managing “environments” (i.e., the set of packages, configuration, etc.) is a critical capability of any Data Science Platform. Not only does...

Data Science Conda Compute Environments Domino Product Domino Practical Techniques Containers Python Reproducibility Data Science Platform

Enabling Data Science Agility with Docker

This post describes how Domino uses Docker to solve a number of interconnected problems for data scientists and researchers, related to environment...

Data Science Leaders At Work Engineering Compute Environments Practical Techniques Data Engineering Data Scientists data Data Science Platform Docker

Python 3.6 with Domino in Minutes

For Pythonistas like me, the holidays started a little early with today's release of Python 3.6. In case you haven't heard, Python 3.6 has a number...

Data Science Code Domino Product Practical Techniques Python

Python for SAS Users: The Pandas Data Analysis Library

This post is a chapter from Randy Betancourt's Python for SAS Users quick start guide. Randy wrote this guide to familiarize SAS users with Python and...

Data Science Code Practical Techniques Python Pandas SAS data

23 Visualizations and When to Use Them

This talk was presented live at PLOTCON 2016 in NYC on November 18, 2016. Scatterplot or bubble chart – what visualization makes the most sense for...

Data Science Visualizations Practical Techniques

Python vs. R for Data Science

R and Python are both popular open source programming languages for data scientists. Each has its advantages for performing data science tasks. So,...

Data Science Code Practical Techniques R Data Scientists Python

Exploring the Limits of Parallelized Machine Learning

This week, Domino’s Chief Data Scientist, Eduardo Ariño de la Rubia, presented a webinar: Machine Learning at Scale with Amazon's X1 Instance. If you...

Data Science Leaders At Work Amazon Machine Learning X1 Practical Techniques Webinar Data Science Leaders IT Models Machine

Gain Shell Access To Your Domino Instances

Note: Please be advised that direct access to containers via SSH has been deprecated for Domino versions above 4.x. Indirect SSH access via Workspace...

Data Science Domino Product Domino Practical Techniques Machine

How Buzzfeed Uses Real-Time Machine Learning to Choose Their Viral Content

This talk took place at the Domino Data Science Pop-up in Los Angeles, CA on September 14, 2016. In this presentation, Jane Kelly, Director of Data...

Data Science Machine Learning Practical Techniques Domino Data Popup data

Wisdom From Machine Learning at Netflix

At Data By The Bay in May, we saw a great talk by Netflix's Justin Basilico: Recommendations for Building Machine Learning Software. Justin describes...

Data Science Machine Learning Domino Practical Techniques Models data Machine

Using k-Nearest Neighbors (k-NN) in Production

What is k-Nearest Neighbors (k-NN)? k-Nearest Neighbors is a simple algorithm that stores all available cases and classifies new cases based on a...

Data Science Code Machine Learning Domino Practical Techniques Webinar R Python Models data K-NN

Choosing Content for Netflix: How Data Leads the Way

This talk took place at the Domino Data Science Pop-up in Los Angeles, CA on September 14, 2016. In this presentation, Paul Ellwood, VP of Data...

Data Science Leaders At Work Machine Learning Practical Techniques Domino Data Popup data

Using Apache Spark to Analyze Large Neuroimaging Datasets

This article was written by Sergul Aydore, Ph.D., and Syed Ashrafulla, Ph.D. Sergul and Syed received their Ph.D.s in Electrical Engineering in 2014...

Data Science Code Practical Techniques Python IT Spark

The "Joel Test" for Data Science

It's the sixteenth anniversary of Joel Spolsky's "Joel Test," which he described as a "highly irresponsible, sloppy test to rate the quality of a...

Data Science Leaders At Work Practical Techniques Data Scientists Best Practices
