To learn image super-resolution, use a GAN to learn how to do image degradation first | |||
Feature Perceptual Loss for Variational Autoencoder | • Autoencoder • Loss function |
||
Context Encoder | Context Encoders: Feature Learning by Inpainting | • Self-supervised learning • Visual representation learning • Image inpainting |
|
Fixing the train-test resolution discrepancy | |||
GANs | Generative Adversarial Nets | • GANs | |
ImageGPT | Generative Pretraining from Pixels | • Self-supervised learning • Visual representation learning |
|
Deformable ConvNets v2: More Deformable, Better Results | • CNN | ||
Deformable Convolutional Networks | • CNN | ||
ControlNet | Adding Conditional Control to Text-to-Image Diffusion Models | • Transformer • Diffusion |
|
BEIT | BEIT: BERT Pre-Training of Image Transformers | • Self-supervised learning • Visual representation learning |
|
Diffusion Illusion | Diffusion Illusions: Hiding Images in Plain Sight | • Diffusion • Illusion |
|
LVDM | Latent Video Diffusion Models for High-Fidelity Long Video Generation | • VSR • Diffusion |
|
Understanding Deformable Alignment in Video Super-Resolution | • VSR • Deformable convolution |
||
Towards Accurate Generative Models of Video: A New Metric & Challenges | • Metric | ||
VQ-VAE-2 | Generating Diverse High-Fidelity Images with VQ-VAE-2 | • Image generation • GANs |
|
VQGAN | Taming Transformers for High-Resolution Image Synthesis | • Image generation • GANs |
|
CDM | Cascaded Diffusion Models for High Fidelity Image Generation | • Image generation | |
Consistency Models | • Image generation | ||
DiffiT | DiffiT: Diffusion Vision Transformers for Image Generation | • Image generation • Transformer |
|
Emu | Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack | • Image generation | |
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | • ISR | ||
Video LDM | Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models | • Video generation | |
SwinIR: Image Restoration Using Swin Transformer | • ISR • Transformer |
||
Blind Super-Resolution Kernel Estimation using an Internal-GAN | • ISR • GANs |
||
BasicVSR | BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond | • VSR | |
BasicVSR++ | BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment | • VSR | |
SR3 | Image Super-Resolution via Iterative Refinement | • ISR • Diffusion |
|
SR3+ | Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild | • ISR • Diffusion |
|
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution | • BISR | ||
DiffBIR | DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior | • BISR | |
MoESR | Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | • ISR • Diffusion |
|
LIIF | Learning Continuous Image Representation with Local Implicit Image Function | • Continuous super-resolution | |
Implicit Diffusion Models for Continuous Super-Resolution | • Continuous super-resolution | ||
Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder | • Continuous super-resolution | ||
Flamingo | Flamingo: a Visual Language Model for Few-Shot Learning | • Transformer | |
VideoGPT | VideoGPT: Video Generation using VQ-VAE and Transformers | ||
Dall-E 3 | Improving Image Generation with Better Captions | • Text-to-image generation | |
FIFO | FIFO-Diffusion: Generating Infinite Videos from Text without Training | • Text-to-image generation | |
Self-supervised Pre-training of Text Recognizers | • Self-supervised learning • Text spotting |
||
Text-DIAE | Text-DIAE: A Self-Supervised Degradation Invariant Autoencoder for Text Recognition and Document Enhancement | • Self-supervised learning • Text spotting |
|
FixMatch | FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence | • Semi-supervised learning | |
Donut | OCR-free Document Understanding Transformer | • Transformer • VDU |
|
ViTLP | Visually Guided Generative Text-Layout Pre-training for Document Intelligence | • Transformer • Text spotting • VDU |
|
Webvicob | On Web-based Visual Corpus Construction for Visual Document Understanding | • Data collection • Text spotting • VDU |
|
SynthText | Synthetic Data for Text Localisation in Natural Images | • Synthetic data generation • Text spotting |
|
SynthTIGER | SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models | • Synthetic data generation • Text spotting |
|
UnrealText | UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World | • Synthetic data generation • Text spotting |
|
SynthText3D | SynthText3D: Synthesizing Scene Text Images from 3D Virtual Worlds | • Synthetic data generation • Text spotting |
|
VISD | Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes | • Synthetic data • Text spotting |
|
LaMa | Resolution-robust Large Mask Inpainting with Fourier Convolutions | • Image inpainting |
KimRass/AI-Papers
AI paper reviews in Korean