GitHub - KimRass/AI-Papers: AI paper reviews in Korean


	To learn image super-resolution, use a GAN to learn how to do image degradation first
	Feature Perceptual Loss for Variational Autoencoder	• Autoencoder • Loss function
Context Encoder	Context Encoders: Feature Learning by Inpainting	• Self-supervised learning • Visual representation learning • Image inpainting
	Fixing the train-test resolution discrepancy
GANs	Generative Adversarial Nets	• GANs
ImageGPT	Generative Pretraining from Pixels	• Self-supervised learning • Visual representation learning
	Deformable ConvNets v2: More Deformable, Better Results	• CNN
	Deformable Convolutional Networks	• CNN
ControlNet	Adding Conditional Control to Text-to-Image Diffusion Models	• Transformer • Diffusion
BEIT	BEIT: BERT Pre-Training of Image Transformers	• Self-supervised learning • Visual representation learning
Diffusion Illusion	Diffusion Illusions: Hiding Images in Plain Sight	• Diffusion • Illusion
LVDM	Latent Video Diffusion Models for High-Fidelity Long Video Generation	• VSR • Diffusion
	Understanding Deformable Alignment in Video Super-Resolution	• VSR • Deformable convolution
	Towards Accurate Generative Models of Video: A New Metric & Challenges	• Metric
VQ-VAE-2	Generating Diverse High-Fidelity Images with VQ-VAE-2	• Image generation • GANs
VQGAN	Taming Transformers for High-Resolution Image Synthesis	• Image generation • GANs
CDM	Cascaded Diffusion Models for High Fidelity Image Generation	• Image generation
	Consistency Models	• Image generation
DiffiT	DiffiT: Diffusion Vision Transformers for Image Generation	• Image generation • Transformer
Emu	Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack	• Image generation
	Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach	• ISR
Video LDM	Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models	• Video generation
SwinIR	SwinIR: Image Restoration Using Swin Transformer	• ISR • Transformer
	Blind Super-Resolution Kernel Estimation using an Internal-GAN	• ISR • GANs
BasicVSR	BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond	• VSR
BasicVSR++	BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment	• VSR
SR3	Image Super-Resolution via Iterative Refinement	• ISR • Diffusion
SR3+	Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild	• ISR • Diffusion
	Designing a Practical Degradation Model for Deep Blind Image Super-Resolution	• BISR
DiffBIR	DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior	• BISR
MoESR	Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach	• ISR • Diffusion
LIIF	Learning Continuous Image Representation with Local Implicit Image Function	• Continuous super-resolution
	Implicit Diffusion Models for Continuous Super-Resolution	• Continuous super-resolution
	Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder	• Continuous super-resolution
Flamingo	Flamingo: a Visual Language Model for Few-Shot Learning	• Transformer
VideoGPT	VideoGPT: Video Generation using VQ-VAE and Transformers
Dall-E 3	Improving Image Generation with Better Captions	• Text-to-image generation
FIFO	FIFO-Diffusion: Generating Infinite Videos from Text without Training	• Text-to-image generation
	Self-supervised Pre-training of Text Recognizers	• Self-supervised learning • Text spotting
Text-DIAE	Text-DIAE: A Self-Supervised Degradation Invariant Autoencoder for Text Recognition and Document Enhancement	• Self-supervised learning • Text spotting
FixMatch	FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence	• Semi-supervised learning
Donut	OCR-free Document Understanding Transformer	• Transformer • VDU
ViTLP	Visually Guided Generative Text-Layout Pre-training for Document Intelligence	• Transformer • Text spotting • VDU
Webvicob	On Web-based Visual Corpus Construction for Visual Document Understanding	• Data collection • Text spotting • VDU
SynthText	Synthetic Data for Text Localisation in Natural Images	• Synthetic data generation • Text spotting
SynthTIGER	SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models	• Synthetic data generation • Text spotting
UnrealText	UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World	• Synthetic data generation • Text spotting
SynthText3D	SynthText3D: Synthesizing Scene Text Images from 3D Virtual Worlds	• Synthetic data generation • Text spotting
VISD	Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes	• Synthetic data • Text spotting
LaMa	Resolution-robust Large Mask Inpainting with Fourier Convolutions	• Image inpainting

History: 106 Commits
Name
.gitignore
3d_gaussian_splatting_for_real_time_radiance_field_rendering.pdf
README.md
a_comprehensive_overview_of_large_language_models.pdf
a_simple_single_scale_vision_transformer_for_object_detection_and_instance_segmentation.pdf
a_survey_of_large_language_models.pdf
a_technical_report_for_polyglot_ko_open_source_large_scale_korean_language_models.pdf
accurate_large_minibatch_sgd_training_imagenet_in_1_hour.pdf
adaptive_budget_allocation_for_parameter_efficient_fine_tuning.pdf
align_before_fuse_vision_and_language_representation_learning_with_momentum_distillation.pdf
align_your_latents_high_resolution_video_synthesis_with_latent_diffusion_models.pdf
all_are_worth_words_a_vit_backbone_for_diffusion_models.pdf
an_image_based_virtual_try_on_network.pdf
animatediff_animate_your_personalized_text_to_image_diffusion_models_without_specific_tuning.pdf
anls_a_universal_document_processing_metric_for_generative_large_language_models.pdf
arbitrary_scale_image_generation_and_upsampling_using_latent_diffusion_model_and_implicit_neural_decoder.pdf
augment_your_batch_improving_generalization_through_instance_repetition.pdf
autoaugment_earning_augmentation_policies_from_data.pdf
bart_denoising_sequence_to_sequence_pre_training_for_natural_language_generation_translation_and_comprehension.pdf
basicvsr++_improving_video_super_resolution_with_enhanced_propagation_and_alignment.pdf
basicvsr++_improving_video_super_resolution_with_enhanced_propagation_and_alignment_supplementary_material.pdf
basicvsr_the_search_for_essential_components_in_video_super_resolution_and_beyond.pdf
beit_bert_pretraining_of_image_transformers.pdf
big_transfer_bit_general_visual_representation_learning.pdf
blind_super_resolution_kernel_estimation_using_an_internal_gan.pdf
blip_bootstrapping_language_image_pretrainingfor_unified_vision_language_understanding_and_generation.pdf
bridging_the_gap_between_end_to_end_and_two_step_text_spotting.pdf
cascaded_diffusion_models_for_high_fidelity_image_generation.pdf
catlip_clip_level_visual_recognition_accuracy_with_faster_pre_training_on_web_scale_image_text_data.pdf
character_region_attention_for_text_spotting.pdf
cleval_character_level_evaluation_for_text_detection_and_recognition_tasks.pdf
coca_contrastive_captioners_are_image_text_foundation_models.pdf
common_diffusion_noise_schedules_and_sample_steps_are_flawed.pdf
context_encoders_feature_learning_by_inpainting.pdf
contrastive_learning_for_unpaired_image_to_image_translation.pdf
cv_vae_a_compatible_video_vae_for_latent_generative_video_models.pdf
deberta_decoding_enhanced_bert_with_disentangled_attention.pdf
deer_detection_agnostic_end_to_end_recognizer_for_scene_text_spotting.pdf
deformable_convnets_v2_more_deformable_better_results.pdf
deformable_convolutional_networks.pdf
deformable_non_local_network_for_video_super_resolution.pdf
denoising_diffusion_probabilistic_models_for_robust_image_super_resolution_in_the_wild.pdf
designing_a_practical_degradation_model_for_deep_blind_image_super_resolution.pdf
diffbir_towards_blind_image_restoration_with_generative_diffusion_prior.pdf
diffit_diffusion_vision_transformers_for_image_generation.pdf
diffuse_to_choose_enriching_image_conditioned_inpainting_in_latent_diffusion_models_for_virtual_try_all.pdf
diffusion_autoencoders_toward_a_meaningful_and_decodable_representation.pdf
diffusion_illusions_hiding_images_in_plain_sight.pdf
diffusion_rwkv_scaling_rwkv_like_architectures_for_diffusion_models.pdf
dinov2_learning_robust_visual_features_without_supervision.pdf
distilbert_a_distilled_version_of_bert_smaller_faster_cheaper_and_lighter.pdf
do_vision_transformers_see_like_convolutional_neural_networks.pdf
dont_decay_the_learning_rate_increase_the_batch_size.pdf
dreamfusion_text_to_d_using_d_diffusion.pdf
east_an_efficient_and_accurate_scene_text_detector.pdf
easyanimate_a_high_performance_long_video_generation_method_based_on_transformer_architecture.pdf
edvr_video_restoration_with_enhanced_deformable_convolutional_networks.pdf
efficientnet_rethinking_model_scaling_for_convolutional_neural_networks.pdf
electra_pre_training_text_encoders_as_discriminators_rather_than_generators.pdf
elevater_a_benchmark_and_toolkit_for_evaluating_language_augmented_visual_models.pdf
elucidating_the_design_space_of_diffusion_based_generative_models.pdf
emerging_properties_in_self_supervised_vision_transformers.pdf
emu_enhancing_image_generation_models_using_photogenic_needles_in_a_haystack.pdf
enhancing_scene_text_detectors_with_realistic_text_image_synthesis_using_diffusion.pdf
eva_clip_improved_training_techniques_for_clip_at_scale.pdf
exploiting_diffusion_prior_for_real_world_image_super_resolution.pdf
exploring_plain_vision_transformer_backbones_for_object_detection.pdf
exploring_the_limits_of_transfer_learning_with_a_unified_text_to_text_transformer.pdf
fast_faster_arbitrarily_shaped_text_detector_with_minimalist_kernel_representation.pdf
fast_rcnn.pdf
feature_perceptual_loss_for_variational_autoencoder.pdf
fifo_diffusion_generating_infinite_videos_from_text_without_training.pdf
fix_the_noise_disentangling_source_feature_for_controllable_domain_translation.pdf
fixing_the_train_test_resolution_discrepancy.pdf
fixmatch_simplifying_semi_supervised_learning_with_consistency_and_confidence.pdf
flamingo_a_visual_language_model_for_few_shot_learning.pdf
flashattention_fast_and_memory_efficient_exact_attention_with_io_awareness.pdf
florence_a_new_foundation_model_for_computer_vision.pdf
gans.md
generating_diverse_high_fidelity_images_with_vq_vae_2.pdf
generating_long_sequences_with_sparse_transformers.pdf
generative_adversarial_nets.pdf
generative_pretraining_from_pixels.pdf
glamm_pixel_grounding_large_multimodal_model.pdf
glide_towards_photorealistic_image_generation_and_editing_with_text_guided_diffusion_models.pdf
gpt_understands_too.pdf
groupvit_semantic_segmentation_emerges_from_text_supervision.pdf
high_performance_large_scale_image_recognition_without_normalization.pdf
hubert_self_supervised_speech_representation_learning_by_masked_prediction_of_hidden_units.pdf
image_inpainting_with_cascaded_modulation_gan_and_object_aware_training.pdf
image_super_resolution_via_iterative_refinement.pdf
image_super_resolution_via_latent_diffusion_a_sampling_space_mixture_of_experts_and_frequency_augmented_decoder_approach.pdf
image_transformer.pdf
immiscible_diffusion_accelerating_diffusion_training_with_noise_assignment.pdf
implicit_diffusion_models_for_continuous_super_resolution.pdf
improved_precision_and_recall_metric_for_assessing_generative_models.pdf
improved_techniques_for_training_gans.pdf
improving_image_generation_with_better_captions.pdf
improving_language_understanding_by_generative_pretraining.pdf
instantfamily_masked_attention_for_zero_shot_multi_id_image_generation.pdf
