A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
This is a page not in the main menu
In this blog post I explain how masking works in sound source separation. It addresses binary masks and complex masks, and carries out an ablation study on their performance for the two-source case.
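As a rough illustration of the two mask types for the two-source case, here is a minimal sketch in NumPy. It is not the code from the post: the random spectrograms, the variable names, and the `mse` helper are all assumptions made for the example, and the masks are "ideal" ones computed from the known sources rather than predicted by a network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical complex spectrograms (frequency bins x time frames) of two sources.
shape = (257, 100)
s1 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
s2 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mix = s1 + s2  # the STFT is linear, so the mixture spectrogram is the sum

# Ideal binary mask: each time-frequency bin is assigned to the louder source.
binary_mask_1 = (np.abs(s1) > np.abs(s2)).astype(float)
binary_mask_2 = 1.0 - binary_mask_1
est1_binary = binary_mask_1 * mix

# Ideal complex mask: a complex gain per bin that corrects magnitude and phase.
# Written as s1 * conj(mix) / |mix|^2, with eps guarding against silent bins.
eps = 1e-8
complex_mask_1 = s1 * np.conj(mix) / (np.abs(mix) ** 2 + eps)
est1_complex = complex_mask_1 * mix


def mse(a, b):
    """Mean squared error between two complex spectrograms."""
    return float(np.mean(np.abs(a - b) ** 2))


binary_error = mse(est1_binary, s1)
complex_error = mse(est1_complex, s1)
print(f"binary mask error:  {binary_error:.4f}")
print(f"complex mask error: {complex_error:.2e}")
```

With oracle masks the complex mask recovers the source almost exactly, while the binary mask leaves the interfering source inside every bin it keeps. In practice both masks are estimated, so the gap is smaller.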
Goal: Audio-visual sound source separation
- Only the source separation functions have been ported, although the project is open source and open to growth.
Tutorial: Loading AudioVisual Content with Nvidia DALI
Goal: Weakly supervised pose detection
Summary: Self-supervised Learning of Interpretable Keypoints from Unlabelled Videos
Goal: Aligning two images depicting objects of the same category
Let’s show how to preprocess audio data for use in deep learning. To do so, we are going to use two very standard libraries.
Published in IEEE MMSP 2020, 2020
Audiovisual dataset of musicians playing different instruments. OpenPose skeletons are provided frame by frame.
Recommended citation: Juan F. Montesinos, Olga Slizovskaia, Gloria Haro (2020). "Solos: A Dataset for Audio-Visual Music Source Separation and Localization" IEEE MMSP 2020 1. https://arxiv.org/pdf/2006.07931.pdf
Published in IEEE MMSP 2020, 2020
Weighted losses applied to a Multi-channel U-Net
Recommended citation: Venkatesh Shenoy, Juan F. Montesinos, Gloria Haro, Emilia Gómez (2020). "Multi-channel U-Net for Music Source Separation." IEEE MMSP2020 1. https://arxiv.org/pdf/2003.10414.pdf
Under review, 2021
Audiovisual model and dataset for singing voice separation
Recommended citation: Juan F. Montesinos, Venkatesh S. Kadandale, Gloria Haro (2021). "A cappella: Audio-visual Singing Voice Separation" https://arxiv.org/pdf/2006.3708242.pdf
Undergraduate course, Pompeu Fabra University, DTIC, 2019
Degree dissertation, Pompeu Fabra University, DTIC, 2019
Summary (EN): This project proposes a method for audio source separation of a signal based on the movements of the players who produce it. The process is composed of three blocks. The first block computes a frequency analysis of the original signal by Non-negative Matrix Factorization (NMF). The video processing block estimates the velocity signal of each player's movements using two types of video segmentation: the first is based on motion trajectories of the objects in the scene, while the second uses optical flow and Principal Component Analysis. The last block correlates the frequency information with the velocity signals, using four variants of a method based on NMF and Non-Negative Least Squares. Finally, some experiments show the efficacy of the different variants of the audio source separation method.
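For readers unfamiliar with NMF, the factorization used in the first block can be sketched as follows. This is a minimal illustration that assumes the classic multiplicative updates for the Euclidean cost; the dissertation summary does not say which NMF variant or cost it uses, and the function name `nmf`, its parameters, and the toy input are all hypothetical.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    W holds spectral templates (freq x rank) and H their activations over
    time (rank x frames). Uses multiplicative updates for the Euclidean
    cost, which keep both factors non-negative (an assumed variant).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy stand-in for a magnitude spectrogram: a non-negative matrix of rank 3.
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 50))

W, H = nmf(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In a separation setting, the rows of `H` are the time activations that could then be compared with the per-player velocity signals described above.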