07x02: Building an AI Training Data Pipeline with VAST Data

Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm

English - June 10, 2024 13:00 - 31 minutes - 35.6 MB - ★★★★★ - 1 rating

Previous Episode: 07x01: Proving the Performance of Solidigm SSDs at StorageReview

Next Episode: 07x03: Benchmarking AI Data Infrastructure with MLCommons

Model training puts serious stress on data infrastructure, but preparing that data for use is a much harder challenge. This episode of Utilizing Tech features Subramanian Kartik of VAST Data discussing the broad data pipeline with Jeniece Wnorowski of Solidigm and Stephen Foskett. The first step in building an AI model is collecting, organizing, tagging, and transforming data. Yet this data is spread across the organization in databases, data lakes, and unstructured repositories. The challenge of building a data pipeline is familiar to most businesses, since a similar process is required in analytics, business intelligence, observability, and simulation, but generative AI applications have an insatiable appetite for data. These applications also demand extreme levels of storage performance, and only flash SSDs can meet this demand. A side benefit is improved power consumption and cooling compared with hard disk drives, and this is especially true as massive SSDs come to market. Ultimately, the success of generative AI will drive greater collection and processing of data on the inferencing side, perhaps at the edge, and this will drive AI data infrastructure further.

Hosts:
Stephen Foskett, Organizer of Tech Field Day: https://www.linkedin.com/in/sfoskett/
Jeniece Wnorowski, Datacenter Product Marketing Manager at Solidigm: https://www.linkedin.com/in/jeniecewnorowski/

Guest:
Subramanian Kartik, Ph.D., Global Systems Engineering Lead at VAST Data: https://www.linkedin.com/in/subramanian-kartik-ph-d-1880835/

Follow Utilizing Tech
Website: https://www.UtilizingTech.com/
X/Twitter: https://www.twitter.com/UtilizingTech

Tech Field Day
Website: https://www.TechFieldDay.com
LinkedIn: https://www.LinkedIn.com/company/Tech-Field-Day
X/Twitter: https://www.Twitter.com/TechFieldDay

Tags: #UtilizingTech, #Sponsored, #AIDataInfrastructure, #AI, @SFoskett, @TechFieldDay, @UtilizingTech, @Solidigm
