Getting Started with Data Science and Databricks


Databricks is a popular environment for developing data science solutions. More and more companies are interested in Databricks because it is simple to set up and includes a collaborative workspace for working with a team. In this session you will see how to create machine learning solutions with multiple workflows, starting with ETL and moving through data exploration and model experimentation to a production release of a data science solution. Today, more and more development is performed on very large datasets. Attendees will learn how to use Apache Spark, which is part of Databricks, to rapidly analyze large amounts of data. Learn how to use Databricks to reduce operational complexity and create solutions that can be scaled up or down depending on the amount of data to be processed, without having to change the underlying code. Python, Jupyter Notebooks, and Apache Spark are the technologies used to create solutions within this session. No experience is required.


  • Ginger Grant

    Ginger Grant manages the consultancy Desert Isle Group and shares what she has learned while working with data technology with people around the world. Last year she co-authored the book Exam Ref 70-774 Perform Cloud Data Science with Azure Machine Learning and has recently released an online class at DataCamp on Intermediate T-SQL. As a Microsoft MVP in Data Platform, Microsoft Certified Trainer, and an Idera ACE, she focuses on guiding clients to create solutions using the entire Microsoft Data Stack, which includes SQL Server, Power BI, and Azure Data Cloud components. When not working, she prototypes the latest pre-release data technologies, maintains her blog, and spends time on Twitter @desertislesql.

Recorded on:

Feb 8, 2020