Music Source Separation with Generative Adversarial Network and Waveform Averaging

Ryosuke Tanabe, Yuto Ichikawa, Takanori Fujisawa, Masaaki Ikehara

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The task of music source separation is to extract a target sound from a mixture. A popular approach uses a deep neural network (DNN) that learns the relationship between the spectrum of the mixed sound and that of the separated sound. However, many DNN algorithms do not consider the clarity of the output sound, which tends to produce artifacts in the output spectrum. We adopt a generative adversarial network (GAN) to improve the clarity of the separated sound. In addition, we propose data augmentation by pitch shifting. The performance of a DNN depends strongly on the quantity of training data; in other words, a training dataset with limited variety gives the network poor knowledge of unknown sound sources. Learning from pitch-shifted signals compensates for the limited variety of the training set and makes the network robust when estimating sound spectra at various pitches. Furthermore, we process the pitch-shifted signals and average them to reduce artifacts. This proposal is based on the idea that a trained network can separate pitch-shifted sound sources as well as the original ones. Compared with the conventional method, our method obtains well-separated signals with fewer artifacts.
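The averaging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the procedure in detail, so the following are assumptions: the mixture is pitch-shifted by a few semitone offsets, each shifted copy is separated, each estimate is shifted back to the original pitch, and the restored waveforms are averaged sample by sample. The function name `separate_with_pitch_averaging`, the callables `separate` and `pitch_shift`, and the default offsets are hypothetical placeholders, not the authors' implementation.

```python
from typing import Callable, List, Sequence

Signal = List[float]


def separate_with_pitch_averaging(
    mixture: Signal,
    separate: Callable[[Signal], Signal],          # assumed: a trained separator
    pitch_shift: Callable[[Signal, int], Signal],  # assumed: shifts by n semitones
    semitones: Sequence[int] = (-1, 0, 1),         # assumed offsets, not from the paper
) -> Signal:
    """Separate pitch-shifted copies of the mixture and average the results."""
    if not semitones:
        raise ValueError("at least one semitone offset is required")

    estimates = []
    for n in semitones:
        shifted = pitch_shift(mixture, n)      # shift the mixture by n semitones
        estimate = separate(shifted)           # separate the shifted mixture
        restored = pitch_shift(estimate, -n)   # assumption: undo the shift before averaging
        estimates.append(restored)

    # Average sample by sample over the shortest estimate, in case the
    # pitch shifter changes the signal length slightly.
    length = min(len(e) for e in estimates)
    count = len(estimates)
    return [sum(e[i] for e in estimates) / count for i in range(length)]
```

Any pitch shifter and any trained separator with these call shapes could be passed in. The idea is that artifacts that differ between the shifted copies partly cancel in the average, while the target signal, which is consistent across copies, is preserved.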

Original language: English
Title of host publication: Conference Record - 53rd Asilomar Conference on Circuits, Systems and Computers, ACSSC 2019
Editors: Michael B. Matthews
Publisher: IEEE Computer Society
Pages: 1796-1800
Number of pages: 5
ISBN (Electronic): 9781728143002
DOIs: https://doi.org/10.1109/IEEECONF44664.2019.9048852
Publication status: Published - 2019 Nov
Event: 53rd Asilomar Conference on Circuits, Systems and Computers, ACSSC 2019 - Pacific Grove, United States
Duration: 2019 Nov 3 → 2019 Nov 6

Publication series

Name: Conference Record - Asilomar Conference on Signals, Systems and Computers
Volume: 2019-November
ISSN (Print): 1058-6393

Conference

Conference: 53rd Asilomar Conference on Circuits, Systems and Computers, ACSSC 2019
Country: United States
City: Pacific Grove
Period: 19/11/3 → 19/11/6

ASJC Scopus subject areas

  • Signal Processing
  • Computer Networks and Communications

Cite this

Tanabe, R., Ichikawa, Y., Fujisawa, T., & Ikehara, M. (2019). Music Source Separation with Generative Adversarial Network and Waveform Averaging. In M. B. Matthews (Ed.), Conference Record - 53rd Asilomar Conference on Circuits, Systems and Computers, ACSSC 2019 (pp. 1796-1800). [9048852] (Conference Record - Asilomar Conference on Signals, Systems and Computers; Vol. 2019-November). IEEE Computer Society. https://doi.org/10.1109/IEEECONF44664.2019.9048852