Links available; click to see the corresponding paper and code repository.
Learning Approximate Stochastic Transition Models
Yuhang Song, Christopher Grimm, Xianming Wang, Michael Littman.
Summable Reparameterizations of Wasserstein Critics in the One-Dimensional Setting.
Christopher Grimm, Yuhang Song, Michael Littman.
Modelling Attention in Panoramic Video: A Reinforcement Learning Approach
Yuhang Song†, Mai Xu∗†, Minglang Qiao, Jianyi Wang, Liangyu Huo. Submitted to TPAMI.
Generalization Tower Network: A Novel Deep Neural Network Architecture for Multi-Task Learning.
Yuhang Song, Mai Xu∗, Songyang Zhang. Submitted to ICML 2018.
Watching Videos with Certain and Constant Quality: PID-based Quality Control Method
Yuhang Song, Mai Xu∗, Shengxi Li. Published in DCC 2017.
A brief introduction to my research experience. Links available; click to see the corresponding paper.
Modelling Stochastic Transition with a Novel GAN. | As 1st author. | Supervised directly by Prof. Michael Littman.
Main works: We show that currently popular GANs struggle to learn stochastic transitions in model-based RL when a closely matched probability distribution is required. In response, we propose a novel GAN, SGAN, obtained by modifying the discriminator loss of the traditional GAN paradigm. We characterize the optimal SGAN and give a three-page theoretical proof showing how the proposed algorithm achieves it. In experiments, SGAN significantly improves results on multiple domains, including a real-world domain.
Summable Reparameterizations of WGAN Critics. | As 2nd author. | Supervised directly by Prof. Michael Littman.
Main works: We identify a class of function decompositions with properties that make them well suited to the critic role in a leading approach to GANs known as Wasserstein GANs (WGANs). We show that Taylor and Fourier series decompositions belong to our class, and provide examples of these critics outperforming standard GAN approaches. We show that this reparameterized critic performs better than standard gradient-penalty WGAN approaches on a set of one-dimensional simulated domains.
Modelling Attention in Panoramic Video with RL. | As 1st author. | Supervised directly by Prof. Mai Xu.
Main works: We establish a database collecting subjects' head movement (HM) positions on panoramic video sequences, and we find from our database that the HM data are highly consistent across subjects. We further find that deep reinforcement learning (DRL) can be applied in predicting HM positions, seen as actions of an agent. Based on our findings, we propose a DRL based HM Prediction (DHP) approach in offline and online versions, called offline-DHP and online-DHP, respectively. Experimental results validate that offline-DHP and online-DHP are effective in predicting HM positions of panoramic video in offline and online manners, respectively. Experimental results also show that the learned offline-DHP model is capable of improving the performance of online-DHP.
A Novel Network Architecture for Multi-Task RL. | As 1st author. | Supervised directly by Prof. Mai Xu.
Main works: We conduct a case study of humans playing similar Atari games and find that hierarchical knowledge is shared across similar tasks. Inspired by how humans develop hierarchical knowledge, we propose a novel deep network, the Generalization Tower Network (GTN), which enables task-label-free multi-task RL within a single model. The main novelty of GTN is the introduction of vertical streams, the effectiveness of which is validated by Fisher Sensitivity (FS) analysis. Experimental results on 51 Atari games verify that our GTN architecture advances the state of the art in multi-task RL.
Deep Exploration via Potential Learning. | As 1st author. | Supervised directly by Prof. Mai Xu.
Main works: We propose a dual-MDP structure to estimate the "learning potential" of an exploration noise. Accordingly, we propose an algorithm that achieves deep exploration by modelling the "learning potential". Experimental results advance the state of the art on a portion (65%) of the continuous control tasks from MuJoCo.
Model-free Video Quality Control. | As 1st author. | Supervised directly by Prof. Mai Xu.
Main works: We focus on the challenging problem of model-free video quality control, and propose a novel PID-based model-free quality control (PQC) method for video coding. We propose to apply Laplace-domain analysis to model the relationship between the quantization parameter (QP) and the control error in this formulation. Experimental results show that our PQC method is effective in terms of both control accuracy and quality fluctuation.
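For readers unfamiliar with PID control, the idea of steering QP from a quality error can be sketched as follows. This is only a minimal illustration and not the PQC method itself: the gains, the `toy_encoder_quality` stand-in (a made-up linear relation between QP and quality), and all function names are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Discrete PID controller; the gains used below are illustrative assumptions."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        # Accumulate the integral term and take a one-step difference
        # as the derivative term.
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def toy_encoder_quality(qp: float) -> float:
    """Hypothetical stand-in for an encoder: quality (dB) falls linearly with QP."""
    return 60.0 - 0.6 * qp


def control_quality(target_db: float, frames: int = 50, qp_init: float = 30.0):
    """Adjust QP frame by frame so the measured quality approaches target_db."""
    pid = PIDController(kp=0.5, ki=0.5, kd=0.05)
    qp = qp_init
    history = []
    for _ in range(frames):
        quality = toy_encoder_quality(qp)
        # Positive error means quality is above target, so QP should rise.
        error = quality - target_db
        # Clamp to 0..51, the QP range used by H.264/HEVC.
        qp = min(51.0, max(0.0, qp_init + pid.step(error)))
        history.append((qp, quality))
    return history
```

Calling `control_quality(40.0)` returns the per-frame `(qp, quality)` pairs; with this toy encoder the quality settles close to the 40 dB target within a few dozen frames.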
I am working with two really cool professors and have co-authored multiple papers with them. Click to check out their Google Scholar pages!
Prof. Michael Littman
Research directions with this professor: GAN in RL, learning closely matched probability distributions with GAN and advancing model-based RL, summable reparameterization for WGANs.
Prof. Mai Xu
Research directions with this professor: RL and Applications in Computer Vision, modelling attention in panoramic video with RL based approaches, multi-task RL with diverse representations, model-free video coding.
iDrone: A Modular Reconfigurable Drone System | As 1st author.
I spent two years focusing on this project, from conceiving the original idea to designing the basic model. Eventually, I built a team of 10 members from diverse backgrounds, and we worked together throughout the project. Although we decided not to accept venture investment offers from a few companies, those offers reflect the outstanding level of our product. Really cool stuff, you have to see it!
A list of scholarships I have been awarded.
National Innovation Scholarship | 9 among 200,000+ | 1st Prize
Offers the highest bonus among all the scholarships in our university.
Airbus Academic Scholarship | 2 among 6,000+ | 1st Prize
Offers the highest bonus among all the scholarships in our department.
Outstanding Science and Technology Scholarship | 12 among 256 | 1st Prize
Awarded for outstanding academic performance.
A list of awards I have earned.
Challenge Cup National Innovation Contest | 9 among 60,000+ | 1st Author&1st Prize
The top entrepreneurship competition in China. This honor carries direct graduate admission privilege to our university.
The 22nd "Fengru Cup" Academic Contest | 15 among 4,600+ | 1st Author&1st Prize
The top academic competition in our university. This honor carries direct graduate admission privilege to our university.
The 21st "Fengru Cup" Innovation Contest | 16 among 4,500+ | 1st Author&1st Prize
The top innovation competition in our university. This honor carries direct graduate admission privilege to our university.
National Information Technology Contest | 89 among 13,000+ | 1st Author&1st Prize
2016 Fengru Cup Entrepreneurship Contest | 6 among 1,100+ | 1st Author&1st Prize
2016 China Aviation Industry Entrepreneurship Contest | 1 among 1,200+
2016 Beihang Electronic Innovation Contest | 1 among 240+ | 1st Author&1st Prize
2016 Top Comprehensive Performance in the Department | 1 among 256
2015 Beihang Electronic Innovation Contest | 1 among 240+ | 1st Author&1st Prize
2015 Top Innovation Performance in the Department | 1 among 256
2014 Beihang Electronic Innovation Contest | 1 among 240+ | 1st Author&1st Prize
2014 Top Innovation Performance in the Department | 1 among 256
MFR: A Somatosensory Equipment Towards Better Gaming Experience | As 1st author.
Move For Real: a really cool project aimed at developing a more natural human-computer interface for first-person shooter games. I spent my freshman year on this project. Check it out!
Other Project Experiences
Some of the other projects I have been involved in. Links available; click to learn more.
Michelson Interferometer Automatic Counting System based on Smartphone and Fourier transform | As 2nd author.
We propose a smartphone-based automatic counting software system for the Michelson interferometer experiment. The software uses the smartphone camera to capture the interference fringe image. A cosine Fourier transform is used to analyze, in the frequency domain, how the interference fringe image changes. Based on this system, the number of interference fringes can be obtained automatically and directly. The system is simple to operate and has a counting accuracy of up to 0.5 fringe. With the popularity of smartphone applications, the system has broad application prospects.
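As a rough illustration of frequency-domain fringe counting, the sketch below counts full brightness cycles in a simulated trace. It is not the system's actual algorithm: it assumes the brightness at the pattern center is sampled once per frame, that fringes pass at a steady rate, and that a whole number of fringes fits in the recording; the function names and the simulated signal are hypothetical.

```python
import math


def dominant_cycle_count(samples):
    """Return the non-DC frequency bin with the largest DFT power.

    When the signal spans the whole record, bin k corresponds to k full
    cycles, i.e. k fringes passing the observed point.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        # Cosine and sine projections of the signal onto frequency bin k.
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return best_k


def simulate_center_brightness(fringes, frames=256):
    """Hypothetical brightness trace as `fringes` fringes pass at a steady rate."""
    return [1.0 + math.cos(2 * math.pi * fringes * t / frames) for t in range(frames)]
```

For example, `dominant_cycle_count(simulate_center_brightness(12))` recovers the 12 simulated fringes. This sketch resolves whole cycles only; reaching sub-fringe accuracy would require additional phase analysis that is not shown here.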
Social Work Experience
I have also devoted myself to various kinds of social work, listed below.