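Related coverage:

  1. Computer Vision After 5 Years - CVPR Workshop: "What will computer vision look like in five years? Imagining the future with the CV pioneers (Part 1) | CVPR 2019" (mp.weixin.qq.com)
  2. "Which directions in deep learning and computer vision could still see breakthroughs, take off, or newly emerge in 2018?" (www.zhihu.com)
  3. "What are the bottlenecks in deep learning?" (www.zhihu.com)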


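Directions that are fairly hot and worth exploring include self-supervised/semi-supervised/weakly supervised/unsupervised learning, few-shot/one-shot/zero-shot learning, interpretability, adversarial examples/security, multimodality, 3D, and so on.

Personally, I am most interested in GANs. CVPR 2020 had more than 110 GAN papers, which shows how hot the topic is. In fact, GANs are also actively moving toward the hot directions above: in terms of data, flexibility, controllability/interpretability, and breadth of application, GANs still badly need improvement and deserve further exploration.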


Directions covered by the GAN papers I collected from CVPR 2020 (117 papers; for reference only, do not repost without permission):

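  • 1 Virtual try-on, fashion and makeup editing
  • 2 Pose transfer, person image synthesis
  • 3 Cartoonization, colorization, style transfer
  • 4 Text-to-image generation
  • 5 Active learning
  • 6 Reinforcement learning
  • 7 Self-supervised, semi-supervised, unsupervised learning
  • 8 Few-shot, one-shot, zero-shot learning
  • 9 Graph representation learning
  • 10 Domain adaptation, knowledge transfer
  • 11 Game generation
  • 12 Medical imaging
  • 13 Diverse generation
  • 14 Image restoration, super-resolution, dehazing, deraining
  • 15 Image inpainting
  • 16 Reflection and shadow removal and generation
  • 17 Image decomposition and separation
  • 18 Attention mechanisms
  • 19 Texture generation
  • 20 Disentangled representations (controllability/interpretability, etc.)
  • 21 Clustering
  • 22 Semantic segmentation
  • 23 Person re-identification
  • 24 GANs combined with adversarial examples/attacks
  • 25 Faces (controllable generation, etc.)
  • 26 Image-to-image translation
  • 27 Training improvements (networks, training strategies, etc.)
  • 28 Semantic generation
  • 29 NAS
  • 30 Image registration
  • 31 Missing-data handling
  • 32 Video generation
  • 33 Privacy protection
  • 34 Model compression
  • 35 3D and others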

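Detailed version: "CVPR 2020: the most complete roundup of GAN paper summaries"; more GAN reading: "There is more to life than goji berries: there are also deep learning, GANs, and faraway places"

List version:

1 Virtual try-on, fashion and makeup editing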

  • Fashion Editing with Adversarial Parsing Learning
  • PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
  • Image Based Virtual Try-on Network from Unpaired Data
  • Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content

2 Pose transfer, person image synthesis

  • MISC: Multi-condition Injection and Spatially-adaptive Compositing for Conditional Person Image Synthesis
  • Deep Image Spatial Transformation for Person Image Generation
  • Controllable Person Image Synthesis with Attribute-Decomposed GAN
  • Wish You Were Here: Context-Aware Human Generation

3 Cartoonization, colorization, style transfer

  • Learning to Cartoonize Using White-box Cartoon Representations
  • PuppeteerGAN: Arbitrary Portrait Animation with Semantic-aware Appearance Transformation
  • Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping
  • Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence
  • Two-Stage Peer-Regularized Feature Recombination for Arbitrary Image Style Transfer

4 Text-to-image generation

  • ManiGAN: Text-Guided Image Manipulation
  • CookGAN: Causality based Text-to-Image Synthesis
  • RiFeGAN: Rich Feature Generation for Text-to-Image Synthesis from Prior Knowledge

5 Active learning

  • State-Relabeling Adversarial Active Learning

6 Reinforcement learning

  • RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real

7 Self-supervised, semi-supervised, unsupervised learning

  • Regularizing Discriminative Capability of CGANs for Semi-Supervised Generative Learning
  • A Self-supervised Approach for Adversarial Robustness
  • Transformation GAN for Unsupervised Image Synthesis and Representation Learning
  • Unsupervised Learning for Intrinsic Image Decomposition from a Single Image
  • Self-Supervised Viewpoint Learning From Image Collections
  • Diverse Image Generation via Self-Conditioned GANs
  • Self-supervised Domain-aware Generative Network for Generalized Zero-shot Learning
  • PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
  • Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning

8 Few-shot, one-shot, zero-shot learning

  • Adversarial Feature Hallucination Networks for Few-Shot Learning
  • One-Shot Domain Adaptation For Face Generation
  • Semi-supervised Learning for Few-shot Image-to-Image Translation
  • Self-supervised Domain-aware Generative Network for Generalized Zero-shot Learning

9 Graph representation learning

  • Distribution-induced Bidirectional Generative Adversarial Network for Graph Representation Learning
  • Semantic Image Manipulation Using Scene Graphs

10 Domain adaptation, knowledge transfer

  • MineGAN: effective knowledge transfer from GANs to target domains with few images
  • Progressive Adversarial Networks for Fine-Grained Domain Adaptation
  • Gradually Vanishing Bridge for Adversarial Domain Adaptation
  • Domain Adaptation for Image Dehazing

11 Game generation

  • Learning to Simulate Dynamic Environments with GameGAN

12 Medical imaging

  • Synthetic Learning: Learn From Distributed Asynchronized Discriminator GAN Without Sharing Medical Image Data
  • Augmenting Colonoscopy using Extended and Directional CycleGAN for Lossy Image Translation

13 Diverse generation

  • Diverse Image Generation via Self-Conditioned GANs
  • UCTGAN: Diverse Image Inpainting based on Unsupervised Cross-Space Translation
  • MaskGAN: Towards Diverse and Interactive Facial Image Manipulation
  • StarGAN v2: Diverse Image Synthesis for Multiple Domains
  • Semantic Pyramid for Image Generation

14 Image restoration, super-resolution, dehazing, deraining

  • Learning Invariant Representation for Unsupervised Image Restoration
  • Domain Adaptation for Image Dehazing
  • EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning
  • Learning Texture Transformer Network for Image Super-Resolution
  • PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
  • Unpaired Image Super-Resolution using Pseudo-Supervision
  • Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
  • Nested Scale-Editing for Conditional Image Synthesis

15 Image inpainting

  • UCTGAN: Diverse Image Inpainting based on Unsupervised Cross-Space Translation
  • Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
  • Semantic Pyramid for Image Generation

16 Reflection and shadow removal and generation

  • Learning to Shadow Hand-drawn Sketches
  • Deep Adversarial Decomposition: A Unified Framework for Separating Superimposed Images
  • Single Image Reflection Removal through Cascaded Refinement
  • Single Image Reflection Removal with Physically-Based Training Images
  • Reflection Scene Separation From a Single Image
  • ARShadowGAN: Shadow Generative Adversarial Network for Augmented Reality in Single Light Scenes
  • BEDSR-Net: A Deep Shadow Removal Network from a Single Document Image

17 Image decomposition and separation

  • Unsupervised Learning for Intrinsic Image Decomposition from a Single Image
  • Deep Adversarial Decomposition: A Unified Framework for Separating Superimposed Images

18 Attention mechanisms

  • Attentive Normalization for Conditional Image Generation
  • S2A: Wasserstein GAN with Spatio-Spectral Laplacian Attention for Multi-Spectral Band Synthesis
  • Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
  • End-to-End Adversarial-Attention Network for Multi-Modal Clustering
  • UCTGAN: Diverse Image Inpainting based on Unsupervised Cross-Space Translation
  • VIBE: Video Inference for Human Body Pose and Shape Estimation
  • DOA-GAN: Dual-Order Attentive Generative Adversarial Network for Image Copy-move Forgery Detection and Localization
  • Deep Image Spatial Transformation for Person Image Generation

19 Texture generation

  • Adversarial Texture Optimization from RGB-D Scans

20 Disentangled representations

  • Semantic Pyramid for Image Generation
  • MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation
  • Image Processing Using Multi-Code GAN Prior
  • Nested Scale-Editing for Conditional Image Synthesis
  • G3AN: Disentangling Appearance and Motion for Video Generation
  • Neural Head Reenactment with Latent Pose Descriptors
  • Guided Variational Autoencoder for Disentanglement Learning
  • Disentangled Image Generation Through Structured Noise Injection

21 Clustering

  • End-to-End Adversarial-Attention Network for Multi-Modal Clustering

22 Semantic segmentation

  • Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation

23 Person re-identification

  • Cross-modality Person re-identification with Shared-Specific Feature Transfer
  • Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking

24 GANs combined with adversarial attacks/examples

  • PhysGAN: Generating Physical-World-Resilient Adversarial Examples for Autonomous Driving
  • Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
  • Adversarial Camouflage: Hiding Physical-World Attacks with Natural Styles
  • A Self-supervised Approach for Adversarial Robustness
  • DaST: Data-free Substitute Training for Adversarial Attacks
  • LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud-based Deep Networks
  • Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking

25 Faces

  • Intuitive, Interactive Beard and Hair Synthesis with Generative Models
  • Disentangled Image Generation Through Structured Noise Injection
  • Adversarial Latent Autoencoders
  • StyleRig: Rigging StyleGAN for 3D Control over Portrait Images
  • Deep 3D Portrait from a Single Image
  • Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning
  • MaskGAN: Towards Diverse and Interactive Facial Image Manipulation
  • Cascade EF-GAN: Progressive Facial Expression Editing with Local Focuses
  • Interpreting the Latent Space of GANs for Semantic Face Editing
  • One-Shot Domain Adaptation For Face Generation
  • Cross-Spectral Face Hallucination via Disentangling Independent Factors
  • Copy and Paste GAN: Face Hallucination from Shaded Thumbnails
  • The GAN that Warped: Semantic Attribute Editing with Unpaired Data
  • Neural Head Reenactment with Latent Pose Descriptors
  • CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks
  • Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

26 Image-to-image translation

  • Fine-grained Image-to-Image Transformation towards Visual Recognition
  • Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation
  • Domain Adaptive Image-to-image Translation
  • DUNIT: Detection-based Unsupervised Image-to-Image Translation
  • High-Resolution Daytime Translation Without Domain Labels
  • Semi-supervised Learning for Few-shot Image-to-Image Translation
  • Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals
  • Cross-domain Correspondence Learning for Exemplar-based Image Translation
  • Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
  • Augmenting Colonoscopy using Extended and Directional CycleGAN for Lossy Image Translation
  • StarGAN v2: Diverse Image Synthesis for Multiple Domains

27 Training improvements

  • Analyzing and Improving the Image Quality of StyleGAN
  • A Characteristic Function Approach to Deep Implicit Generative Modeling
  • Alleviation of Gradient Exploding in GANs: Fake Can Be Real
  • Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions
  • A U-Net Based Discriminator for Generative Adversarial Networks
  • MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks
  • Noise Robust Generative Adversarial Networks
  • On Positive-Unlabeled Classification in GAN
  • AdversarialNAS: Adversarial Neural Architecture Search for GANs
  • Adversarial Latent Autoencoders
  • Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

28 Semantic generation

  • SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
  • Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation
  • SketchyCOCO: Image Generation from Freehand Scene Sketches
  • BachGAN: High-Resolution Image Synthesis from Salient Object Layout
  • Panoptic-based Image Synthesis
  • Semantic Image Manipulation Using Scene Graphs
  • Semantically Multi-modal Image Synthesis
  • Wish You Were Here: Context-Aware Human Generation

29 NAS

  • AdversarialNAS: Adversarial Neural Architecture Search for GANs

30 Image registration

  • Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation

31 Missing-data handling

  • GAMIN: Generative Adversarial Multiple Imputation Network for Highly Missing Data

32 Video generation

  • G3AN: Disentangling Appearance and Motion for Video Generation

33 Privacy protection

  • DOA-GAN: Dual-Order Attentive Generative Adversarial Network for Image Copy-move Forgery Detection and Localization
  • CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks

34 Model compression

  • GAN Compression: Efficient Architectures for Interactive Conditional GANs

35 3D and others

  • StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching
  • SharinGAN: Combining Synthetic and Real Data for Unsupervised Geometry Estimation
  • Learning Generative Models of Shape Handles
  • SynSin: End-to-end View Synthesis from a Single Image
  • VIBE: Video Inference for Human Body Pose and Shape Estimation
  • Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis
  • LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud-based Deep Networks



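Drawing on the five top conferences in computer vision and robotics (CVPR/ICCV/IROS/ICRA/ECCV) and on industry demand, here are three research areas that are currently hot and at the frontier.

3D vision

3D vision is a long-established research area, but it has developed rapidly over the past five years. Its main topics are 3D perception (point cloud acquisition and processing), pose estimation (visual SLAM), 3D reconstruction (large-scale scene reconstruction, dynamic 3D reconstruction), and 3D understanding (recognition, detection, and segmentation of 3D objects, etc.).

Figure 1: Application scenarios of 3D vision

The share of 3D vision papers at the top CV conferences is also increasing year by year.

Figure 2: Share of 3D vision papers at the three top vision conferences

Video understanding

With the emergence of new online media and the arrival of the 5G era, video has grown explosively and has become the dominant form of content on the mobile internet. This volume of video cannot be processed by hand alone, so intelligent video understanding has become a pressing problem.

Since 2012, deep learning has made major breakthroughs in image understanding, but video understanding is far more complex than detecting and recognizing objects in images. Videos often contain many actions, and an action is usually a collection of complex concepts: it can be a simple behavior, but it may also carry complex emotions and intentions. A simple example: which is easier, classifying a video or classifying an image?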
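One way to make the question concrete: a video is a sequence of frames, so even the simplest baseline has to classify every frame and then aggregate over time. The sketch below is a toy illustration and not a method from this answer. `classify_frame` is a hypothetical stand-in for any image classifier, the "frames" are made-up lists of pixel values, and averaging per-frame scores is an assumed baseline.

```python
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]      # toy "image": a flat list of pixel values in [0, 1]
Scores = Dict[str, float]    # class label -> score


def classify_frame(frame: Frame) -> Scores:
    """Hypothetical image classifier: bright frames score high for 'day'."""
    brightness = sum(frame) / len(frame)
    return {"day": brightness, "night": 1.0 - brightness}


def classify_video(frames: List[Frame],
                   image_model: Callable[[Frame], Scores]) -> str:
    """Assumed baseline: average the per-frame scores, then pick the best label."""
    totals: Scores = {}
    for frame in frames:
        for label, score in image_model(frame).items():
            totals[label] = totals.get(label, 0.0) + score
    averaged = {label: total / len(frames) for label, total in totals.items()}
    return max(averaged, key=averaged.get)


# Two made-up clips with the same frames in opposite order.
sunrise = [[0.2, 0.2], [0.6, 0.6], [1.0, 1.0]]   # dark -> bright
sunset = list(reversed(sunrise))                 # bright -> dark

label_sunrise = classify_video(sunrise, classify_frame)
label_sunset = classify_video(sunset, classify_frame)
```

Both clips receive the same label, because averaging discards the order of the frames. Telling a sunrise from a sunset needs a model of how frames change over time, which is the extra difficulty that image classification does not have.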

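Well-known computer vision competitions in recent years show the same trend: image-level competitions are decreasing while video-level competitions are increasing.

Multimodal fusion

Knowledge acquisition through multimodal fusion refers to the process of acquiring knowledge by cross-fusing data from different modalities such as text, images, video, and audio.

As computer vision matures, the problems that vision alone cannot solve will increasingly rely on multiple sensors compensating for one another and being fused.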
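As a rough illustration of what fusing modalities can mean in practice, the sketch below shows late fusion: each modality produces its own class scores, and the scores are combined with a weighted average. This is one common scheme chosen for illustration, not something this answer specifies. The function name `late_fusion`, the modality names, the weights, and the scores are all made-up assumptions.

```python
from typing import Dict

Scores = Dict[str, float]    # class label -> score


def late_fusion(per_modality: Dict[str, Scores],
                weights: Dict[str, float]) -> Scores:
    """Weighted average of class scores across modalities (assumed scheme)."""
    total_weight = sum(weights[modality] for modality in per_modality)
    fused: Scores = {}
    for modality, scores in per_modality.items():
        for label, score in scores.items():
            fused[label] = fused.get(label, 0.0) + weights[modality] * score
    return {label: total / total_weight for label, total in fused.items()}


# Made-up scores: the image alone is ambiguous, the audio track is not.
predictions = {
    "image": {"dog": 0.5, "wolf": 0.5},
    "audio": {"dog": 0.9, "wolf": 0.1},
}
fused = late_fusion(predictions, weights={"image": 1.0, "audio": 1.0})
best = max(fused, key=fused.get)
```

Here the audio scores break a tie that the image scores cannot, which is the kind of mutual compensation between sensors described above.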

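Summary

Why fear that truth is endless? Every inch of progress brings its own joy!


Shenlan Xueyuan (https://www.shenlanxueyuan.com/) is an online education platform focused on artificial intelligence. It aims to set an industry standard for curricula in frontier technologies, covering AI fundamentals, machine learning, computer vision, natural language processing, intelligent robotics, and related fields.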


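Thanks for the invite. Personally, I think the following directions are very valuable:

  1. AutoML
  2. vision + language
  3. transfer learning (domain adaptation, zero-shot, few-shot, etc.)


Personally, I feel that combining vision and language is still very promising. There are many pitfalls at the moment, but if a breakthrough comes one day, the field will develop at great speed.

He also gave an example: the behavior of a puff of smoke can be described by equations, but the growth of a tree is shaped by temperature, light, water, climate, and many other complex factors changing continuously over a long time, so it cannot be explained simply.

This example is not apt. On the contrary, it shows exactly how urgent interpretability is for neural network models. Although many factors influence a tree's growth, we at least know that, with the other variables controlled, suitable temperature, suitable water, and similar conditions reliably steer the tree toward better growth.

For a neural network model, however, whatever the task, it is hard to say that controlling just one factor will move the result in the expected direction.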

