Data engineering has historically involved extracting data from disparate sources, transforming it into a standard layout, and then loading it into a new database for analytics. These pipeline jobs would usually run on a schedule, such as nightly or weekly. In today's fast-paced, high-tech world, however, the need for data closer to real time, meaning close to when it was first generated, is higher than ever. In today's episode, we hear from Dustin Vannoy, a consultant and blogger in the streaming data space, about how to use Apache Spark, one of the most popular streaming analytics platforms.




How to connect with Dustin:


- WEBSITE: https://dustinvannoy.com/


- TWITTER: https://twitter.com/dustinvannoy


- LINKEDIN: https://www.linkedin.com/in/dustinvannoy/


- YOUTUBE: https://www.youtube.com/channel/UCYdC0t9EFtyVAs0-cwqVCTw/videos




Learn data skills at our academy and elevate your career. Start for free at https://ftdacademy.com/pod



---

Send in a voice message: https://podcasters.spotify.com/pod/show/ftdacademy/message
