talk-data.com

Event

September Monthly CASSUG Meeting

2023-09-11 · Meetup

Greetings, data enthusiasts!

Our September meeting is scheduled for Monday, September 11, at 5:30 PM! We will meet in person at the Rensselaer Chamber of Commerce, 90 4th Street, Troy, NY.

Please RSVP if you are attending, so we can purchase an appropriate amount of food for everyone. Food is TBA, but since John (our speaker) is driving out from New England, I am letting him pick the cuisine. Can he say "Chowder"? We shall see :-)

Our meeting schedule is as follows:

  • 5:30 PM: Food, soft drinks, and networking
  • 6:15 PM: Chapter news and announcements
  • 6:30 PM: Presentation

We usually wrap up between 7:30 PM and 8:00 PM.

This month, we welcome John Miner, who will give us an introductory presentation on Spark SQL. Here is the session info:

This presentation is a crash course covering the basics of Spark SQL for the Microsoft SQL Server (T-SQL) developer.

Azure Databricks is a managed service that provides the latest versions of Apache Spark, built on open-source libraries. Spin up clusters and build quickly in a fully managed environment with the global scale and availability of Microsoft Azure.

The Adventure Works database is provided as raw delimited files to transform. We will go over reading and writing files in popular file formats using PySpark, a Python-based wrapper for the Scala API. The real power of PySpark is the ability to read a file into a data frame and expose the contents of the file as a temporary view during processing. Optionally, the raw data files can be presented as tables in the Hive catalog. Once this abstraction is in place, all the SQL skills that you have built up over the years can be used to transform the views and tables in the Hive catalog into refined data in the data lake.
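For readers who want a feel for the workflow before the session, here is a minimal sketch of the pattern described above: read a delimited file into a data frame, register it as a temporary view, query it with Spark SQL, and write the result as Parquet. It is not taken from John's material. The helper names (`build_session`, `load_as_view`, `refine`), the file paths, the pipe delimiter, the view name, and the column names in the query are all assumptions made for illustration.

```python
# Minimal sketch: raw delimited file -> data frame -> temporary view -> Spark SQL -> Parquet.
# Every path, view name, and column name here is hypothetical.


def build_session(app_name="adventure-works-demo"):
    """Create (or reuse) a Spark session. In a Databricks notebook, `spark` already exists."""
    from pyspark.sql import SparkSession  # imported lazily; needs PySpark installed
    return SparkSession.builder.appName(app_name).getOrCreate()


def load_as_view(spark, path, view_name, delimiter="|"):
    """Read a delimited file into a data frame and expose it as a temporary view."""
    df = (
        spark.read
        .option("header", "true")       # first row holds column names (assumed)
        .option("inferSchema", "true")  # let Spark guess column types
        .option("delimiter", delimiter)
        .csv(path)
    )
    df.createOrReplaceTempView(view_name)
    return df


def refine(spark, query, out_path):
    """Run a SQL statement against the registered views and write the result as Parquet."""
    result = spark.sql(query)
    result.write.mode("overwrite").parquet(out_path)
    return result


# Hypothetical query; the column names are assumed, not confirmed by the session abstract.
TOP_CUSTOMERS_SQL = """
    SELECT CustomerID, SUM(TotalDue) AS total_due
    FROM sales_order_header
    GROUP BY CustomerID
    ORDER BY total_due DESC
    LIMIT 10
"""

# Example usage (paths are placeholders):
#   spark = build_session()
#   load_as_view(spark, "/mnt/raw/SalesOrderHeader.csv", "sales_order_header")
#   refine(spark, TOP_CUSTOMERS_SQL, "/mnt/refined/top_customers")
```

A temporary view lives only for the current Spark session. To present the data as a table in the catalog instead, one option is `df.write.saveAsTable("some_table_name")`, where the table name is again a placeholder.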
