Pulse Reviews

Music Information Retrieval: Introduction, Deep Learning I & II

660 Lomita Court, Stanford, CA 94305

08 August, 2022

Description

Deep Learning for Music Information Retrieval I: How Neural Networks Learn Audio (August 8-12)

This workshop will cover the industry-standard methods to develop deep neural network architectures for digital audio. Throughout five immersive days of study, we will cover theoretical, mathematical, and practical principles that deep learning researchers use every day in the real world.

Deep Learning for MIR II: State-of-the-art Algorithms (August 15-19)

This course is meant for individuals who want to gain experience applying deep learning to solve a problem of their interest in MIR. Instructors and a lineup of guest speakers leading research in industry and academia will present a survey of cutting-edge MIR research that uses deep learning. Instructors will explain and demonstrate concepts in models that are used in cutting-edge industry and academic research. Students will tackle a real problem of their choice using deep learning models, with instructors serving as on-demand advisors. Students will build and train state-of-the-art models using TensorFlow/PyTorch and GPU computing, adapting them to a problem of their interest.

Theory includes: generative models, self-supervised feature learning, and attention mechanisms.

Models covered include: DeepSpeech, Transformer, CREPE, and GrFNNs.

Practice: music and speech recognition/synthesis, beat tracking, music recommendation, and semantic analysis.

Prerequisites: Deep Learning for MIR I

About the Instructors

Camille Noufi is a PhD student and researcher at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University. Camille studies machine generation of expressive communication and the acoustic impact of the environment on the voice. Her interdisciplinary research combines signal processing (DSP), machine learning (ML), and human-computer interaction (HCI) with psychology and vocal science. She was a research intern on the Audio Team at Meta Reality Labs in 2020. Before coming to CCRMA, she worked on audio scene analysis and vocal biomarker research at MIT Lincoln Laboratory. Her research has been presented at the Interspeech, ISMIR, and ICML conferences.

Irán R. Román is a theoretical neuroscientist and machine listening scientist at New York University’s Music and Audio Research Laboratory. Irán is a passionate instructor with extensive experience teaching artificial intelligence and deep learning. His industry experience includes deep learning engineering internships at Plantronics in 2017, Apple in 2018 and 2019, Oscilloscape in 2020, and Tesla in 2021. Irán’s research has focused on using deep learning for speech recognition and auditory scene analysis.

