My Arxiv Daily

Author: baiyucraft

Blog: baiyucraft’s Home


Updated on 2023.10.25 09:31

MOT

| Publish Date | Title | Authors | arXiv | PDF | Code |
|---|---|---|---|---|---|
| 2023-10-23 | Achromatic, planar Fresnel-Reflector for a Single-beam Magneto-optical Trap | Saskia Bondza et al. | 2310.14861 | :mortar_board: | None |
| 2023-10-23 | Label Space Partition Selection for Multi-Object Tracking Using Two-Layer Partitioning | Ji Youn Lee et al. | 2310.14506 | :mortar_board: | None |
| 2023-10-23 | Player Re-Identification Using Body Part Appearences | Mahesh Bhosale et al. | 2310.14469 | :mortar_board: | None |
| 2023-10-22 | Neural Text Sanitization with Privacy Risk Indicators: An Empirical Analysis | Anthi Papadopoulou et al. | 2310.14312 | :mortar_board: | None |
| 2023-10-22 | Deep MDP: A Modular Framework for Multi-Object Tracking | Abhineet Singh et al. | 2310.14294 | :mortar_board: | Code |
| 2023-10-20 | EarlyBird: Early-Fusion for Multi-View Tracking in the Bird’s Eye View | Torben Teepe et al. | 2310.13350 | :mortar_board: | Code |
| 2023-10-19 | Deep Learning Techniques for Video Instance Segmentation: A Survey | Chenhao Xu et al. | 2310.12393 | :mortar_board: | None |
| 2023-10-18 | Runner re-identification from single-view video in the open-world setting | Tomohiro Suzuki et al. | 2310.11700 | :mortar_board: | None |
| 2023-10-17 | Learning Comprehensive Representations with Richer Self for Text-to-Image Person Re-Identification | Shuanglin Yan et al. | 2310.11210 | :mortar_board: | None |
| 2023-10-13 | Pairwise Similarity Learning is SimPLE | Yandong Wen et al. | 2310.09449 | :mortar_board: | None |
| 2023-10-12 | Progress towards ultracold Sr for the AION project – sub-microkelvin atoms and an optical-heterodyne diagnostic tool for injection-locked laser diodes | E. Pasatembou et al. | 2310.08500 | :mortar_board: | None |
| 2023-10-12 | Beyond Sharing Weights in Decoupling Feature Learning Network for UAV RGB-Infrared Vehicle Re-Identification | Xingyue Liu et al. | 2310.08026 | :mortar_board: | None |
| 2023-10-11 | ProtoHPE: Prototype-guided High-frequency Patch Enhancement for Visible-Infrared Person Re-identification | Guiwei Zhang et al. | 2310.07552 | :mortar_board: | None |
| 2023-10-10 | Automatic nodule identification and differentiation in ultrasound videos to facilitate per-nodule examination | Siyuan Jiang et al. | 2310.06339 | :mortar_board: | None |
| 2023-10-09 | Joint object detection and re-identification for 3D obstacle multi-camera systems | Irene Cortés et al. | 2310.05785 | :mortar_board: | None |
| 2023-10-08 | Multi-Ship Tracking by Robust Similarity metric | Hongyu Zhao et al. | 2310.05171 | :mortar_board: | None |
| 2023-10-07 | Comparative study of multi-person tracking methods | Denis Mbey Akola et al. | 2310.04825 | :mortar_board: | None |
| 2023-10-06 | Alice Benchmarks: Connecting Real World Object Re-Identification with the Synthetic | Xiaoxiao Sun et al. | 2310.04416 | :mortar_board: | None |
| 2023-10-06 | VI-Diff: Unpaired Visible-Infrared Translation Diffusion Model for Single Modality Labeled Visible-Infrared Person Re-identification | Han Huang et al. | 2310.04122 | :mortar_board: | None |
| 2023-10-04 | COOLer: Class-Incremental Learning for Appearance-Based Multiple Object Tracking | Zhizheng Liu et al. | 2310.03006 | :mortar_board: | Code |
| 2023-10-04 | ShaSTA-Fuse: Camera-LiDAR Sensor Fusion to Model Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking | Tara Sadjadpour et al. | 2310.02532 | :mortar_board: | None |
| 2023-10-03 | DARTH: Holistic Test-time Adaptation for Multiple Object Tracking | Mattia Segu et al. | 2310.01926 | :mortar_board: | Code |
| 2023-10-02 | Offline Tracking with Object Permanence | Xianzhong Liu et al. | 2310.01288 | :mortar_board: | None |
| 2023-10-02 | Strength in Diversity: Multi-Branch Representation Learning for Vehicle Re-Identification | Eurico Almeida et al. | 2310.01129 | :mortar_board: | Code |
| 2023-10-02 | LoCUS: Learning Multiscale 3D-consistent Features from Posed Images | Dominik A. Kloepfer et al. | 2310.01095 | :mortar_board: | None |
| 2023-09-30 | Magneto-optical trap reaction microscope for photoioization of cold strontium atoms | Shushu Ruan et al. | 2310.00389 | :mortar_board: | None |
| 2023-09-30 | Walking = Traversable? : Traversability Prediction via Multiple Human Object Tracking under Occlusion | Jonathan Tay Yu Liang et al. | 2310.00242 | :mortar_board: | None |
| 2023-09-29 | Prototype-guided Cross-modal Completion and Alignment for Incomplete Text-based Person Re-identification | Tiantian Gong et al. | 2309.17104 | :mortar_board: | None |
| 2023-09-29 | SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features | Song Wang et al. | 2309.16987 | :mortar_board: | None |
| 2023-09-28 | Hyperfine structure of the $A^{1}\Pi$ state of AlCl and its relevance to laser cooling and trapping | J. R. Daniel et al. | 2309.16835 | :mortar_board: | None |
| 2023-09-27 | AaP-ReID: Improved Attention-Aware Person Re-identification | Vipin Gautam et al. | 2309.15780 | :mortar_board: | None |
| 2023-09-27 | 3D Multiple Object Tracking on Autonomous Driving: A Literature Review | Peng Zhang et al. | 2309.15411 | :mortar_board: | None |
| 2023-09-26 | A Quantitative Information Flow Analysis of the Topics API | Mário S. Alvim et al. | 2309.14746 | :mortar_board: | None |
| 2023-09-25 | Magneto-optical trap performance for high-bandwidth applications | Benjamin Adams et al. | 2309.14026 | :mortar_board: | None |
| 2023-09-24 | Combining Two Adversarial Attacks Against Person Re-Identification Systems | Eduardo de O. Andrade et al. | 2309.13763 | :mortar_board: | None |
| 2023-09-24 | Towards Robust Robot 3D Perception in Urban Environments: The UT Campus Object Dataset | Arthur Zhang et al. | 2309.13549 | :mortar_board: | Code |
| 2023-09-23 | AgriSORT: A Simple Online Real-time Tracking-by-Detection framework for robotics in precision agriculture | Leonardo Saraceni et al. | 2309.13393 | :mortar_board: | None |
| 2023-09-23 | YOLORe-IDNet: An Efficient Multi-Camera System for Person-Tracking | Vipin Gautam et al. | 2309.13387 | :mortar_board: | None |
| 2023-09-21 | Human Following in Mobile Platforms with Person Re-Identification | Mario Srouji et al. | 2309.12479 | :mortar_board: | None |
| 2023-09-21 | DIOR: Dataset for Indoor-Outdoor Reidentification – Long Range 3D/2D Skeleton Gait Collection Pipeline, Semi-Automated Gait Keypoint Labeling and Baseline Evaluation Methods | Yuyang Chen et al. | 2309.12429 | :mortar_board: | None |
| 2023-09-21 | BASE: Probably a Better Approach to Multi-Object Tracking | Martin Vonheim Larsen et al. | 2309.12035 | :mortar_board: | None |
| 2023-09-21 | Person Re-Identification for Robot Person Following with Online Continual Learning | Hanjing Ye et al. | 2309.11727 | :mortar_board: | None |
| 2023-09-20 | PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement | Chengyou Jia et al. | 2309.11125 | :mortar_board: | None |
| 2023-09-19 | OccluTrack: Rethinking Awareness of Occlusion for Enhancing Multiple Pedestrian Tracking | Jianjun Gao et al. | 2309.10360 | :mortar_board: | None |
| 2023-09-18 | Localization-Guided Track: A Deep Association Multi-Object Tracking Framework Based on Localization Confidence of Detections | Ting Meng et al. | 2309.09765 | :mortar_board: | Code |
| 2023-09-18 | Moving Object Detection and Tracking with 4D Radar Point Cloud | Zhijun Pan et al. | 2309.09737 | :mortar_board: | None |
| 2023-09-15 | Beyond Domain Gap: Exploiting Subjectivity in Sketch-Based Person Retrieval | Kejun Lin et al. | 2309.08372 | :mortar_board: | Code |
| 2023-09-13 | Tracking Particles Ejected From Active Asteroid Bennu With Event-Based Vision | Loïc J. Azzalini et al. | 2309.06819 | :mortar_board: | None |
| 2023-09-12 | The Influence of Contrast and Temporal Expansion on the Marching-on-in-Time Contrast Current Density Volume Integral Equation | Petrus W. N. van Diepen et al. | 2309.06321 | :mortar_board: | None |
| 2023-09-12 | Modality Unifying Network for Visible-Infrared Person Re-Identification | Hao Yu et al. | 2309.06262 | :mortar_board: | None |
| 2023-09-12 | Which Framework is Suitable for Online 3D Multi-Object Tracking for Autonomous Driving with Automotive 4D Imaging Radar? | Jianan Liu et al. | 2309.06036 | :mortar_board: | None |
| 2023-09-12 | SoccerNet 2023 Challenges Results | Anthony Cioppa et al. | 2309.06006 | :mortar_board: | Code |
| 2023-09-09 | DeNoising-MOT: Towards Multiple Object Tracking with Severe Occlusions | Teng Fu et al. | 2309.04682 | :mortar_board: | None |
| 2023-09-09 | BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification | Takuro Fujii et al. | 2309.04675 | :mortar_board: | None |
| 2023-09-07 | Region Generation and Assessment Network for Occluded Person Re-Identification | Shuting He et al. | 2309.03558 | :mortar_board: | None |
| 2023-09-07 | Genericity of singularities in spacetimes with weakly trapped submanifolds | Ivan Pontual Costa e Silva et al. | 2309.03421 | :mortar_board: | None |
| 2023-09-06 | FishMOT: A Simple and Effective Method for Fish Tracking Based on IoU Matching | Shuo Liu et al. | 2309.02975 | :mortar_board: | Code |
| 2023-09-06 | Fast and Resource-Efficient Object Tracking on Edge Devices: A Measurement Study | Sanjana Vijay Ganesh et al. | 2309.02666 | :mortar_board: | None |
| 2023-09-04 | Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification | Zhiyin Shao et al. | 2309.01420 | :mortar_board: | None |
| 2023-09-03 | Spatial-temporal Vehicle Re-identification | Hye-Geun Kim et al. | 2309.01166 | :mortar_board: | None |
| 2023-09-03 | UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance | Son Tran et al. | 2309.01078 | :mortar_board: | None |
| 2023-09-02 | Tracking without Label: Unsupervised Multiple Object Tracking via Contrastive Similarity Learning | Sha Meng et al. | 2309.00942 | :mortar_board: | None |
| 2023-09-01 | Object-Centric Multiple Object Tracking | Zixu Zhao et al. | 2309.00233 | :mortar_board: | Code |
| 2023-08-31 | Illumination Distillation Framework for Nighttime Person Re-Identification and A New Benchmark | Andong Lu et al. | 2308.16486 | :mortar_board: | Code |
| 2023-08-30 | Occlusion-Aware Detection and Re-ID Calibrated Network for Multi-Object Tracking | Yukun Su et al. | 2308.15795 | :mortar_board: | None |
| 2023-08-29 | Learning Cross-modality Information Bottleneck Representation for Heterogeneous Person Re-Identification | Haichao Shi et al. | 2308.15063 | :mortar_board: | None |
| 2023-08-27 | Semantic-aware Consistency Network for Cloth-changing Person Re-Identification | Peini Guo et al. | 2308.14113 | :mortar_board: | Code |
| 2023-08-25 | ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking | Cheng-Che Cheng et al. | 2308.13229 | :mortar_board: | Code |
| 2023-08-23 | Camera-Driven Representation Learning for Unsupervised Domain Adaptive Person Re-identification | Geon Lee et al. | 2308.11901 | :mortar_board: | None |
| 2023-08-23 | HashReID: Dynamic Network with Binary Codes for Efficient Person Re-identification | Kshitij Nikhal et al. | 2308.11900 | :mortar_board: | None |
| 2023-08-23 | Multi-object Detection, Tracking and Prediction in Rugged Dynamic Environments | Shixing Huang et al. | 2308.11870 | :mortar_board: | None |
| 2023-08-22 | (Un)fair Exposure in Deep Face Rankings at a Distance | Andrea Atzori et al. | 2308.11732 | :mortar_board: | None |
| 2023-08-22 | Delving into Motion-Aware Matching for Monocular 3D Object Tracking | Kuan-Chih Huang et al. | 2308.11607 | :mortar_board: | Code |
| 2023-08-22 | TrackFlow: Multi-Object Tracking with Normalizing Flows | Gianluca Mancusi et al. | 2308.11513 | :mortar_board: | None |
| 2023-08-22 | TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under Complex Motions and Diverse Scenes | Xiaoyan Cao et al. | 2308.11157 | :mortar_board: | Code |
| 2023-08-22 | Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models | Alex Nyffenegger et al. | 2308.11103 | :mortar_board: | Code |
| 2023-08-21 | Rethinking Person Re-identification from a Projection-on-Prototypes Perspective | Qizao Wang et al. | 2308.10717 | :mortar_board: | None |
| 2023-08-21 | Color Prompting for Data-Free Continual Unsupervised Domain Adaptive Person Re-Identification | Jianyang Gu et al. | 2308.10716 | :mortar_board: | Code |
| 2023-08-21 | Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification | Qizao Wang et al. | 2308.10692 | :mortar_board: | None |
| 2023-08-21 | Learning Clothing and Pose Invariant 3D Shape Representation for Long-Term Person Re-Identification | Feng Liu et al. | 2308.10658 | :mortar_board: | None |
| 2023-08-19 | Noisy-Correspondence Learning for Text-to-Image Person Re-identification | Yang Qin et al. | 2308.09911 | :mortar_board: | Code |
| 2023-08-19 | LEGO: Learning and Graph-Optimized Modular Tracker for Online Multi-Object Tracking with Point Clouds | Zhenrong Zhang et al. | 2308.09908 | :mortar_board: | None |
| 2023-08-19 | DiffusionTrack: Diffusion Model For Multi-Object Tracking | Run Luo et al. | 2308.09905 | :mortar_board: | None |
| 2023-08-17 | Identity-Aware Semi-Supervised Learning for Comic Character Re-Identification | Gürkan Soykan et al. | 2308.09096 | :mortar_board: | None |
| 2023-08-17 | Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification | Zhaopeng Dou et al. | 2308.08887 | :mortar_board: | Code |
| 2023-08-17 | BOTT: Box Only Transformer Tracker for 3D Object Tracking | Lubing Zhou et al. | 2308.08753 | :mortar_board: | None |
| 2023-08-16 | Privacy at Risk: Exploiting Similarities in Health Data for Identity Inference | Lucas Lange et al. | 2308.08310 | :mortar_board: | None |
| 2023-08-15 | AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary Pedestrian Attributes | Yunhao Li et al. | 2308.07537 | :mortar_board: | None |
| 2023-08-14 | FOLT: Fast Multiple Object Tracking from UAV-captured Videos Based on Optical Flow | Mufeng Yao et al. | 2308.07207 | :mortar_board: | None |
| 2023-08-12 | 3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking | Shuxiao Ding et al. | 2308.06635 | :mortar_board: | Code |
| 2023-08-11 | Combining feature aggregation and geometric similarity for re-identification of patterned animals | Veikka Immonen et al. | 2308.06335 | :mortar_board: | None |
| 2023-08-11 | Collaborative Tracking Learning for Frame-Rate-Insensitive Multi-Object Tracking | Yiheng Liu et al. | 2308.05911 | :mortar_board: | None |
| 2023-08-09 | An End-to-End Framework of Road User Detection, Tracking, and Prediction from Monocular Images | Hao Cheng et al. | 2308.05026 | :mortar_board: | None |
| 2023-08-09 | Tracking Players in a Badminton Court by Two Cameras | Young-Ching Chou et al. | 2308.04872 | :mortar_board: | None |
| 2023-08-08 | 1st Place Solution for CVPR2023 BURST Long Tail and Open World Challenges | Kaer Huang et al. | 2308.04598 | :mortar_board: | None |
| 2023-08-08 | Person Re-Identification without Identification via Event Anonymization | Shafiq Ahmad et al. | 2308.04402 | :mortar_board: | Code |
| 2023-08-08 | Multi-level Map Construction for Dynamic Scenes | Xinggang Hu et al. | 2308.04000 | :mortar_board: | Code |
| 2023-08-07 | Video-based Person Re-identification with Long Short-Term Representation Learning | Xuehu Liu et al. | 2308.03703 | :mortar_board: | None |
| 2023-08-07 | Part-Aware Transformer for Generalizable Person Re-identification | Hao Ni et al. | 2308.03322 | :mortar_board: | None |
| 2023-08-04 | Exploring Part-Informed Visual-Language Learning for Person Re-Identification | Yin Lin et al. | 2308.02738 | :mortar_board: | None |
| 2023-08-03 | ReIDTrack: Multi-Object Track and Segmentation Without Motion | Kaer Huang et al. | 2308.01622 | :mortar_board: | None |
| 2023-08-02 | A Hybrid Approach To Real-Time Multi-Object Tracking | Vincenzo Mariano Scarrica et al. | 2308.01248 | :mortar_board: | None |
| 2023-08-02 | Towards Discriminative Representation with Meta-learning for Colonoscopic Polyp Re-Identification | Suncheng Xiang et al. | 2308.00929 | :mortar_board: | None |
| 2023-08-01 | Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking | Mingzhan Yang et al. | 2308.00783 | :mortar_board: | Code |
| 2023-08-01 | Loading of a large Yb MOT on the $^1S_0$-$^1P_1$ transition | Hector Letellier et al. | 2308.00387 | :mortar_board: | None |
| 2023-08-01 | Advancing Frame-Dropping in Multi-Object Tracking-by-Detection Systems Through Event-Based Detection Triggering | Matti Henning et al. | 2308.00330 | :mortar_board: | None |
| 2023-07-31 | A Trajectory K-Anonymity Model Based on Point Density and Partition | Wanshu Yu et al. | 2307.16849 | :mortar_board: | None |
| 2023-07-31 | Poly-MOT: A Polyhedral Framework For 3D Multi-Object Tracking | Xiaoyu Li et al. | 2307.16675 | :mortar_board: | Code |
| 2023-07-28 | MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Ruopeng Gao et al. | 2307.15700 | :mortar_board: | None |
| 2023-07-28 | Uncertainty-aware Unsupervised Multi-Object Tracking | Kai Liu et al. | 2307.15409 | :mortar_board: | None |
| 2023-07-27 | The detection and rectification for identity-switch based on unfalsified control | Junchao Huang et al. | 2307.14591 | :mortar_board: | None |
| 2023-07-26 | Large-scale Fully-Unsupervised Re-Identification | Gabriel Bertocco et al. | 2307.14278 | :mortar_board: | None |
| 2023-07-24 | Hierarchical Skeleton Meta-Prototype Contrastive Learning with Hard Skeleton Mining for Unsupervised Person Re-Identification | Haocong Rao et al. | 2307.12917 | :mortar_board: | Code |
| 2023-07-24 | CTVIS: Consistent Training for Online Video Instance Segmentation | Kaining Ying et al. | 2307.12616 | :mortar_board: | Code |
| 2023-07-20 | Learning Discriminative Visual-Text Representation for Polyp Re-Identification | Suncheng Xiang et al. | 2307.10625 | :mortar_board: | Code |
| 2023-07-18 | Balancing Privacy and Progress in Artificial Intelligence: Anonymization in Histopathology for Biomedical Research and Education | Neel Kanwal et al. | 2307.09426 | :mortar_board: | None |
| 2023-07-18 | Pixel-wise Graph Attention Networks for Person Re-identification | Wenyu Zhang et al. | 2307.09183 | :mortar_board: | Code |
| 2023-07-17 | Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification | Tengfei Liang et al. | 2307.08316 | :mortar_board: | None |
| 2023-07-14 | Implementing an electronic sideband offset lock for precision spectroscopy in radium | Tenzin Rabga et al. | 2307.07646 | :mortar_board: | None |
| 2023-07-14 | Erasing, Transforming, and Noising Defense Network for Occluded Person Re-Identification | Neng Dong et al. | 2307.07187 | :mortar_board: | None |
| 2023-07-14 | TVPR: Text-to-Video Person Retrieval and a New Benchmark | Fan Ni et al. | 2307.07184 | :mortar_board: | None |
| 2023-07-13 | Domain-adaptive Person Re-identification without Cross-camera Paired Samples | Huafeng Li et al. | 2307.06533 | :mortar_board: | None |
| 2023-07-12 | Multi-Object Tracking as Attention Mechanism | Hiroshi Fukui et al. | 2307.05874 | :mortar_board: | None |
| 2023-07-09 | HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding | Hao Zheng et al. | 2307.05721 | :mortar_board: | None |
| 2023-07-11 | High density loading and collisional loss of laser cooled molecules in an optical trap | Varun Jorapur et al. | 2307.05347 | :mortar_board: | None |
| 2023-07-11 | MinkSORT: A 3D deep feature extractor using sparse convolutions to improve 3D multi-object tracking in greenhouse tomato plants | David Rapado-Rincon et al. | 2307.05219 | :mortar_board: | None |
| 2023-07-08 | Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification | Huafeng Li et al. | 2307.03903 | :mortar_board: | None |
| 2023-07-06 | Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification | Mahdi Alehdaghi et al. | 2307.03240 | :mortar_board: | None |
| 2023-07-06 | Smartphones in a Microwave: Formal and Experimental Feasibility Study on Fingerprinting the Corona-Warn-App | Henrik Graßhoff et al. | 2307.02931 | :mortar_board: | None |
| 2023-07-05 | Multi Object Tracking for Predictive Collision Avoidance | Bruk Gebregziabher et al. | 2307.02161 | :mortar_board: | None |
| 2023-07-01 | Improving CNN-based Person Re-identification using score Normalization | Ammar Chouchane et al. | 2307.00397 | :mortar_board: | None |
| 2023-06-29 | MotionTrack: End-to-End Transformer-based Multi-Object Tracing with LiDAR-Camera Fusion | Ce Zhang et al. | 2306.17000 | :mortar_board: | None |
| 2023-06-29 | Trajectory Poisson multi-Bernoulli mixture filter for traffic monitoring using a drone | Ángel F. García-Fernández et al. | 2306.16890 | :mortar_board: | None |
| 2023-06-27 | DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs | Yanjing Li et al. | 2306.15390 | :mortar_board: | None |
| 2023-06-27 | On Gibbs Sampling Architecture for Labeled Random Finite Sets Multi-Object Tracking | Anthony Trezza et al. | 2306.15135 | :mortar_board: | None |
| 2023-06-25 | A Novel Dual-pooling Attention Module for UAV Vehicle Re-identification | Xiaoyan Guo et al. | 2306.14104 | :mortar_board: | None |
| 2023-06-23 | Segmentation and Tracking of Vegetable Plants by Exploiting Vegetable Shape Feature for Precision Spray of Agricultural Robots | Nan Hu et al. | 2306.13518 | :mortar_board: | Code |
| 2023-06-23 | Deep macroscopic pure-optical potential for laser cooling and trapping of neutral atoms without using a magneto-optical trap | O. N. Prudnikov et al. | 2306.13294 | :mortar_board: | None |
| 2023-06-22 | Iterative Scale-Up ExpansionIoU and Deep Features Association for Multi-Object Tracking in Sports | Hsiang-Wei Huang et al. | 2306.13074 | :mortar_board: | Code |
| 2023-06-21 | Generalizable Metric Network for Cross-domain Person Re-identification | Lei Qi et al. | 2306.11991 | :mortar_board: | None |
| 2023-06-20 | Data-Driven but Privacy-Conscious: Pedestrian Dataset De-identification via Full-Body Person Synthesis | Maxim Maximov et al. | 2306.11710 | :mortar_board: | None |
| 2023-06-16 | Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition | Ashish Jha et al. | 2306.09822 | :mortar_board: | Code |
| 2023-06-16 | UTOPIA: Unconstrained Tracking Objects without Preliminary Examination via Cross-Domain Adaptation | Pha Nguyen et al. | 2306.09613 | :mortar_board: | None |
| 2023-06-15 | Knowledge Assembly: Semi-Supervised Multi-Task Learning from Multiple Datasets with Disjoint Labels | Federica Spinola et al. | 2306.08839 | :mortar_board: | None |
| 2023-06-15 | Graph Convolution Based Efficient Re-Ranking for Visual Retrieval | Yuqi Zhang et al. | 2306.08792 | :mortar_board: | Code |
| 2023-06-14 | Self-Supervised Polyp Re-Identification in Colonoscopy | Yotam Intrator et al. | 2306.08591 | :mortar_board: | None |
| 2023-06-13 | Marking anything: application of point cloud in extracting video target features | Xiangchun Xu et al. | 2306.07559 | :mortar_board: | None |
| 2023-06-13 | Retrieve Anyone: A General-purpose Person Re-identification Task with Instructions | Weizhen He et al. | 2306.07520 | :mortar_board: | None |
| 2023-06-10 | Vista-Morph: Unsupervised Image Registration of Visible-Thermal Facial Pairs | Catherine Ordun et al. | 2306.06505 | :mortar_board: | None |
| 2023-06-09 | TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses | Xuesong Chen et al. | 2306.05888 | :mortar_board: | None |
| 2023-06-09 | A Dual-Source Attention Transformer for Multi-Person Pose Tracking | Andreas Doering et al. | 2306.05807 | :mortar_board: | None |
| 2023-06-08 | Tracking Objects with 3D Representation from Videos | Jiawei He et al. | 2306.05416 | :mortar_board: | None |
| 2023-06-08 | SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth | Zelin Liu et al. | 2306.05238 | :mortar_board: | Code |
| 2023-06-08 | Population-Based Evolutionary Gaming for Unsupervised Person Re-identification | Yunpeng Zhai et al. | 2306.05236 | :mortar_board: | None |
| 2023-06-08 | On the Robustness of Topics API to a Re-Identification Attack | Nikhil Jha et al. | 2306.05094 | :mortar_board: | Code |
| 2023-06-06 | Real-Time Online Unsupervised Domain Adaptation for Real-World Person Re-identification | Christopher Neff et al. | 2306.03993 | :mortar_board: | None |
| 2023-06-05 | Differentially Private Cross-camera Person Re-identification | Lucas Maris et al. | 2306.02765 | :mortar_board: | None |
| 2023-06-05 | MotionTrack: Learning Motion Predictor for Multiple Object Tracking | Changcheng Xiao et al. | 2306.02585 | :mortar_board: | None |
| 2023-06-02 | Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work | Qiangchang Wang et al. | 2306.01929 | :mortar_board: | None |
| 2023-06-02 | Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models | Virginia Fernandez et al. | 2306.01322 | :mortar_board: | None |
| 2023-06-01 | Design and simulation of a source of cold cadmium for atom interferometry | Satvika Bandarupally et al. | 2306.00782 | :mortar_board: | None |
| 2023-05-31 | Dictionary Learning under Symmetries via Group Representations | Subhroshekhar Ghosh et al. | 2305.19557 | :mortar_board: | None |
| 2023-05-28 | Z-GMOT: Zero-shot Generic Multiple Object Tracking | Kim Hoang Tran et al. | 2305.17648 | :mortar_board: | None |
| 2023-05-26 | Linear Object Detection in Document Images using Multiple Object Tracking | Philippe Bernet et al. | 2305.16968 | :mortar_board: | None |
| 2023-05-26 | Fast refacing of MR images with a generative neural network lowers re-identification risk and preserves volumetric consistency | Nataliia Molchanova et al. | 2305.16922 | :mortar_board: | Code |
| 2023-05-26 | Blue-detuned molecular magneto-optical trap schemes based on bayesian optimization | S. Xu et al. | 2305.16576 | :mortar_board: | None |
| 2023-05-26 | Tree-Based Diffusion Schrödinger Bridge with Applications to Wasserstein Barycenters | Maxence Noble et al. | 2305.16557 | :mortar_board: | Code |
| 2023-05-25 | Camera-Incremental Object Re-Identification with Identity Knowledge Evolution | Hantao Yao et al. | 2305.15909 | :mortar_board: | Code |
| 2023-05-25 | Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion Data and Natural Language | Nicola Messina et al. | 2305.15842 | :mortar_board: | Code |
| 2023-05-25 | Multi-query Vehicle Re-identification: Viewpoint-conditioned Network, Unified Dataset and New Metric | Aihua Zheng et al. | 2305.15764 | :mortar_board: | None |
| 2023-05-25 | Dynamic Enhancement Network for Partial Multi-modality Person Re-identification | Aihua Zheng et al. | 2305.15762 | :mortar_board: | None |
| 2023-05-24 | Reducing Rydberg state dc polarizability by microwave dressing | J. C. Bohorquez et al. | 2305.15200 | :mortar_board: | None |
| 2023-05-23 | MOTRv3: Release-Fetch Supervision for End-to-End Multi-Object Tracking | En Yu et al. | 2305.14298 | :mortar_board: | None |
| 2023-05-23 | Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle Re-identification | Aihua Zheng et al. | 2305.13659 | :mortar_board: | None |
| 2023-05-23 | MaskCL: Semantic Mask-Driven Contrastive Learning for Unsupervised Person Re-Identification with Clothes Change | Mingkun Li et al. | 2305.13600 | :mortar_board: | None |
| 2023-05-22 | Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking | Feng Yan et al. | 2305.12724 | :mortar_board: | Code |
| 2023-05-22 | Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement | De Cheng et al. | 2305.12711 | :mortar_board: | None |
| 2023-05-22 | Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID | De cheng et al. | 2305.12673 | :mortar_board: | None |
| 2023-05-17 | Towards Object Re-Identification from Point Clouds for 3D MOT | Benjamin Thérien et al. | 2305.10210 | :mortar_board: | None |
| 2023-05-17 | S$^3$Track: Self-supervised Tracking with Soft Assignment Flow | Fatemeh Azimi et al. | 2305.09981 | :mortar_board: | None |
| 2023-05-16 | SCTracker: Multi-object tracking with shape and confidence constraints | Huan Mao et al. | 2305.09523 | :mortar_board: | None |
| 2023-05-15 | DopUS-Net: Quality-Aware Robotic Ultrasound Imaging based on Doppler Signal | Zhongliang Jiang et al. | 2305.08938 | :mortar_board: | Code |
| 2023-05-15 | GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training | Xiaoyu Tian et al. | 2305.08808 | :mortar_board: | Code |
| 2023-05-15 | Non-Separable Multi-Dimensional Network Flows for Visual Computing | Viktoria Ehm et al. | 2305.08628 | :mortar_board: | None |
| 2023-05-12 | Grating magneto-optical traps with complicated level structures | D. S. Barker et al. | 2305.07732 | :mortar_board: | None |
| 2023-05-10 | Clothes-Invariant Feature Learning by Causal Intervention for Clothes-Changing Person Re-identification | Xulin Li et al. | 2305.06145 | :mortar_board: | None |
| 2023-05-09 | MoT: Pre-thinking and Recalling Enable ChatGPT to Self-Improve with Memory-of-Thoughts | Xiaonan Li et al. | 2305.05181 | :mortar_board: | None |
| 2023-05-08 | Simulations of a frequency-chirped magneto-optical trap of MgF | Kayla J. Rodriguez et al. | 2305.04879 | :mortar_board: | None |
| 2023-05-05 | A Race Track Trapped-Ion Quantum Processor | S. A. Moses et al. | 2305.03828 | :mortar_board: | Code |
| 2023-05-03 | Imaging a $^6$Li Atom In An Optical Tweezer 2000 Times with $\Lambda$-Enhanced Gray Molasses | Karl N. Blodgett et al. | 2305.02405 | :mortar_board: | None |
| 2023-04-30 | LIMOT: A Tightly-Coupled System for LiDAR-Inertial Odometry and Multi-Object Tracking | Zhongyang Zhu et al. | 2305.00406 | :mortar_board: | None |
| 2023-04-29 | Fusion for Visual-Infrared Person ReID in Real-World Surveillance Using Corrupted Multimodal Data | Arthur Josi et al. | 2305.00320 | :mortar_board: | Code |
| 2023-04-27 | Deeply-Coupled Convolution-Transformer with Spatial-temporal Complementary Learning for Video-based Person Re-identification | Xuehu Liu et al. | 2304.14122 | :mortar_board: | None |
| 2023-04-25 | Self-Supervised Multi-Object Tracking From Consistency Across Timescales | Christopher Lang et al. | 2304.13147 | :mortar_board: | None |
| 2023-04-25 | Pseudo Labels Refinement with Intra-camera Similarity for Unsupervised Person Re-identification | Pengna Li et al. | 2304.12634 | :mortar_board: | None |
| 2023-04-24 | MOTLEE: Distributed Mobile Multi-Object Tracking with Localization Error Elimination | Mason B. Peterson et al. | 2304.12175 | :mortar_board: | None |
| 2023-04-19 | Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification | Suncheng Xiang et al. | 2304.09498 | :mortar_board: | Code |
| 2023-04-19 | Enhancing Multi-Camera People Tracking with Anchor-Guided Clustering and Spatio-Temporal Consistency ID Re-Assignment | Hsiang-Wei Huang et al. | 2304.09471 | :mortar_board: | Code |
| 2023-04-18 | You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking | Xiyang Wang et al. | 2304.08709 | :mortar_board: | Code |
| 2023-04-17 | OVTrack: Open-Vocabulary Multiple Object Tracking | Siyuan Li et al. | 2304.08408 | :mortar_board: | None |
| 2023-04-17 | The Impact of Frame-Dropping on Performance and Energy Consumption for Multi-Object Tracking | Matti Henning et al. | 2304.08152 | :mortar_board: | None |
| 2023-04-16 | Ontology for Healthcare Artificial Intelligence Privacy in Brazil | Tiago Andres Vaz et al. | 2304.07889 | :mortar_board: | None |
| 2023-04-16 | Bent & Broken Bicycles: Leveraging synthetic data for damaged object re-identification | Luca Piano et al. | 2304.07883 | :mortar_board: | None |
| 2023-04-16 | A Novel end-to-end Framework for Occluded Pixel Reconstruction with Spatio-temporal Features for Improved Person Re-identification | Prathistith Raj Medi et al. | 2304.07721 | :mortar_board: | None |
| 2023-04-16 | Handling Heavy Occlusion in Dense Crowd Tracking by Focusing on the Heads | Yu Zhang et al. | 2304.07705 | :mortar_board: | None |
| 2023-04-12 | Measuring Re-identification Risk | CJ Carey et al. | 2304.07210 | :mortar_board: | Code |
| 2023-04-10 | Analysing Fairness of Privacy-Utility Mobility Models | Yuting Zhan et al. | 2304.06469 | :mortar_board: | None |
| 2023-04-12 | TopTrack: Tracking Objects By Their Top | Jacob Meilleur et al. | 2304.06114 | :mortar_board: | Code |
| 2023-04-12 | Learning Transferable Pedestrian Representation from Multimodal Information Supervision | Liping Bao et al. | 2304.05554 | :mortar_board: | Code |
| 2023-04-11 | SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes | Yutao Cui et al. | 2304.05170 | :mortar_board: | Code |
| 2023-04-10 | Multi-Object Tracking by Iteratively Associating Detections with Uniform Appearance for Trawl-Based Fishing Bycatch Monitoring | Cheng-Yen Yang et al. | 2304.04816 | :mortar_board: | None |
| 2023-04-09 | Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification | Jiawei Feng et al. | 2304.04205 | :mortar_board: | Code |
| 2023-04-07 | PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive Shift | Gaojie Wu et al. | 2304.03481 | :mortar_board: | None |
| 2023-04-04 | PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification | Minsu Kim et al. | 2304.01537 | :mortar_board: | None |
| 2023-04-04 | Attention Map Guided Transformer Pruning for Edge Device | Junzhu Mao et al. | 2304.01452 | :mortar_board: | Code |
| 2023-04-03 | A Scale-Invariant Trajectory Simplification Method for Efficient Data Collection in Videos | Yang Liu et al. | 2304.01340 | :mortar_board: | Code |
| 2023-04-03 | Navigating to Objects Specified by Images | Jacob Krantz et al. | 2304.01192 | :mortar_board: | None |
| 2023-03-31 | Adaptive Sparse Pairwise Loss for Object Re-Identification | Xiao Zhou et al. | 2303.18247 | :mortar_board: | Code |
| 2023-03-27 | PADME-SoSci: A Platform for Analytics and Distributed Machine Learning for the Social Sciences | Zeyd Boukhers et al. | 2303.18200 | :mortar_board: | None |
| 2023-03-30 | Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Weihua Chen et al. | 2303.17602 | :mortar_board: | Code |
| 2023-03-30 | Streaming Video Model | Yucheng Zhao et al. | 2303.17228 | :mortar_board: | Code |
| 2023-03-28 | Large-scale Training Data Search for Object Re-identification | Yue Yao et al. | 2303.16186 | :mortar_board: | None |
| 2023-03-28 | Mask-Free Video Instance Segmentation | Lei Ke et al. | 2303.15904 | :mortar_board: | Code |
| 2023-03-27 | Learnable Graph Matching: A Practical Paradigm for Data Association | Jiawei He et al. | 2303.15414 | :mortar_board: | Code |
| 2023-03-27 | ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box | Yifu Zhang et al. | 2303.15334 | :mortar_board: | None |
| 2023-03-26 | SDTracker: Synthetic Data Based Multi-Object Tracking | Yingda Guan et al. | 2303.14653 | :mortar_board: | None |
| 2023-03-26 | MRCN: A Novel Modality Restitution and Compensation Network for Visible-Infrared Person Re-identification | Yukang Zhang et al. | 2303.14626 | :mortar_board: | None |
| 2023-03-25 | Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification | Yukang Zhang et al. | 2303.14481 | :mortar_board: | Code |
| 2023-03-25 | Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation | Sanbao Su et al. | 2303.14346 | :mortar_board: | None |
| 2023-03-24 | A CNN-LSTM Architecture for Marine Vessel Track Association Using Automatic Identification System (AIS) Data | Md Asif Bin Syed et al. | 2303.14068 | :mortar_board: | None |
| 2023-03-24 | Multimodal Adaptive Fusion of Face and Gait Features using Keyless attention based Deep Neural Networks for Human Identification | Ashwin Prakash et al. | 2303.13814 | :mortar_board: | None |
| 2023-03-22 | Man vs the machine: The Struggle for Effective Text Anonymisation in the Age of Large Language Models | Constantinos Patsakis et al. | 2303.12429 | :mortar_board: | None |
| 2023-03-21 | OmniTracker: Unifying Object Tracking by Tracking-with-Detection | Junke Wang et al. | 2303.12079 | :mortar_board: | None |
| 2023-03-21 | CLIP-ReIdent: Contrastive Training for Player Re-Identification | Konrad Habel et al. | 2303.11855 | :mortar_board: | None |
| 2023-03-21 | Deep Learning for Video-based Person Re-Identification: A Survey | Khawar Islam et al. | 2303.11332 | :mortar_board: | None |
| 2023-03-20 | Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-Identification | Jiaer Xia et al. | 2303.10976 | :mortar_board: | None |
| 2023-03-20 | Open-World Pose Transfer via Sequential Test-Time Adaption | Junyang Chen et al. | 2303.10945 | :mortar_board: | None |
| 2023-03-18 | Report of the Medical Image De-Identification (MIDI) Task Group – Best Practices and Recommendations | David A. Clunie et al. | 2303.10473 | :mortar_board: | None |
| 2023-03-18 | MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking | Zheng Qin et al. | 2303.10404 | :mortar_board: | None |
| 2023-03-17 | GOOD: General Optimization-based Fusion for 3D Object Detection via LiDAR-Camera Object Candidates | Bingqi Shen et al. | 2303.09800 | :mortar_board: | None |
| 2023-03-16 | Rt-Track: Robust Tricks for Multi-Pedestrian Tracking | Yukuan Zhang et al. | 2303.09668 | :mortar_board: | None |
| 2023-03-15 | Mining False Positive Examples for Text-Based Person Re-identification | Wenhao Xu et al. | 2303.08466 | :mortar_board: | Code |
| 2023-03-15 | Real-time Multi-Object Tracking Based on Bi-directional Matching | Huilan Luo et al. | 2303.08444 | :mortar_board: | None |
| 2023-03-13 | MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID | Jianyang Gu et al. | 2303.07065 | :mortar_board: | None |
| 2023-03-13 | TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification | Haocong Rao et al. | 2303.06819 | :mortar_board: | Code |
| 2023-03-13 | Dynamic Clustering and Cluster Contrastive Learning for Unsupervised Person Re-identification | Ziqi He et al. | 2303.06810 | :mortar_board: | None |
2023-03-11PRSNet: A Masked Self-Supervised Learning Pedestrian Re-Identification MethodZhijie Xiao et.al.2303.06330:mortar_board:Code
2023-03-09A 2D MOT of dysprosium atoms as a compact source for efficient loading of a narrow-line 3D MOTShuwei Jin et.al.2303.05191:mortar_board:None
2023-03-06Memory Maps for Video Object Detection and Tracking on UAVsBenjamin Kiefer et.al.2303.03508:mortar_board:None
2023-03-06Referring Multi-Object TrackingDongming Wu et.al.2303.03366:mortar_board:Code
2023-03-06Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed EnvironmentsJun Yamada et.al.2303.03365:mortar_board:None
2023-03-06UniHCP: A Unified Model for Human-Centric PerceptionsYuanzheng Ci et.al.2303.02936:mortar_board:None
2023-03-033D Multi-Object Tracking Based on Uncertainty-Guided Data AssociationJiawei He et.al.2303.01786:mortar_board:Code
2023-03-03Feature Completion Transformer for Occluded Person Re-identificationTao Wang et.al.2303.01656:mortar_board:None
2023-02-28DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor FusionMohamed Nagy et.al.2302.14807:mortar_board:Code
2023-02-28Membership Inference Attack for Beluga Whales DiscriminationVoncarlos Marcelo Araújo et.al.2302.14769:mortar_board:None
2023-02-28Focus On Details: Online Multi-object Tracking with Diverse Fine-grained RepresentationHao Ren et.al.2302.14589:mortar_board:None
2023-02-28A Little Bit Attention Is All You Need for Person Re-IdentificationMarkus Eisenbach et.al.2302.14574:mortar_board:None
2023-02-28Mesh-SORT: Simple and effective of location-wise trackerZongTan Li et.al.2302.14415:mortar_board:None
2023-02-28DC-Former: Diverse and Compact Transformer for Person Re-IdentificationWen Li et.al.2302.14335:mortar_board:Code
2023-02-28Ultra-high vacuum pressure measurement using cold atomsS. Supakar et.al.2302.14305:mortar_board:None
2023-02-25DeepBrainPrint: A Novel Contrastive Framework for Brain MRI Re-IdentificationLemuel Puglisi et.al.2302.13057:mortar_board:None
2023-02-23Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-IdentificationGerard Maggiolino et.al.2302.11813:mortar_board:Code
2023-02-21BrackishMOT: The Brackish Multi-Object Tracking DatasetMalte Pedersen et.al.2302.10645:mortar_board:Code
2023-02-20On the Stability and Generalization of Triplet LearningJun Chen et.al.2302.09815:mortar_board:None
2023-02-17A Review on Generative Adversarial Networks for Data Augmentation in Person Re-Identification SystemsVictor Uc-Cetina et.al.2302.09119:mortar_board:None
2023-02-17Self-Supervised Representation Learning from Temporal Ordering of Automated Driving SequencesChristopher Lang et.al.2302.09043:mortar_board:None
2023-02-16Visible-Infrared Person Re-Identification via Patch-Mixed Cross-Modality LearningZhihao Qian et.al.2302.08212:mortar_board:None
2023-02-15DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open ScenesShenghao Hao et.al.2302.07676:mortar_board:Code
2023-02-11DaliID: Distortion-Adaptive Learned Invariance for Identification ModelsWes Robbins et.al.2302.05753:mortar_board:None
2023-02-11ConMAE: Contour Guided MAE for Unsupervised Vehicle Re-IdentificationJing Yang et.al.2302.05673:mortar_board:None
2023-02-10Tensor-to-scalar ratio forecasts for extended LiteBIRD frequency configurationsU. Fuskeland et.al.2302.05228:mortar_board:None
2023-02-09Deep Intra-Image Contrastive Learning for Weakly Supervised One-Step Person SearchJiabei Wang et.al.2302.04607:mortar_board:Code
2023-02-07Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object TrackingZiqi Pang et.al.2302.03802:mortar_board:Code
2023-02-07Self-Supervised Unseen Object Instance Segmentation via Long-Term Robot InteractionYangxiao Lu et.al.2302.03793:mortar_board:None
2023-02-05Spatio-Temporal Point Process for Multiple Object TrackingTao Wang et.al.2302.02444:mortar_board:None
2023-02-04X-ReID: Cross-Instance Transformer for Identity-Level Person Re-IdentificationLeqi Shen et.al.2302.02075:mortar_board:None
2023-02-03Spectral Aware Softmax for Visible-Infrared Person Re-IdentificationLei Tan et.al.2302.01512:mortar_board:None
2023-02-02Exploring Invariant Representation for Visible-Infrared Person Re-IdentificationLei Tan et.al.2302.00884:mortar_board:None
2023-01-29Unsupervised Domain Adaptation on Person Re-Identification via Dual-level Asymmetric Mutual LearningQiong Wu et.al.2301.12439:mortar_board:None
2023-01-25An Efficient Semi-Automated Scheme for Infrastructure LiDAR AnnotationAotian Wu et.al.2301.10732:mortar_board:None
2023-01-25Tracking Different Ant Species: An Unsupervised Domain Adaptation Framework and a Dataset for Multi-object TrackingChamath Abeysinghe et.al.2301.10559:mortar_board:None
2023-01-24A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic DataMeenatchi Sundaram Muthu Selva Annamalai et.al.2301.10053:mortar_board:None
2023-01-23Illumination Variation Correction Using Image Synthesis For Unsupervised Domain Adaptive Person Re-IdentificationJiaqi Guo et.al.2301.09702:mortar_board:None
2023-01-23Triplet Contrastive Learning for Unsupervised Vehicle Re-identificationFei Shen et.al.2301.09498:mortar_board:Code
2023-01-18Robust Knowledge Adaptation for Federated Unsupervised Person ReIDJianfeng Weng et.al.2301.07320:mortar_board:None
2023-01-17Database Matching Under Noisy Synchronization ErrorsSerhat Bakirtas et.al.2301.06796:mortar_board:None
2023-01-16Meta Generative Attack on Person ReidentificationA V Subramanyam et.al.2301.06286:mortar_board:None
2023-01-14Arcade Processes for Informed Martingale Interpolation and TransportGeorges Kassis et.al.2301.05936:mortar_board:None
2023-01-05Learning Feature Recovery Transformer for Occluded Person Re-identificationBoqiang Xu et.al.2301.01879:mortar_board:Code
2023-01-02Learning Invariance from Generated Variance for Unsupervised Person Re-identificationHao Chen et.al.2301.00725:mortar_board:Code
2023-01-02A contrastive learning approach for individual re-identification in a wild fish populationØrjan Langøy Olsen et.al.2301.00596:mortar_board:None
2023-01-02Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-identificationZiyi Tang et.al.2301.00531:mortar_board:None
2022-12-31Tracking Passengers and Baggage Items using Multiple Overhead Cameras at Security CheckpointsAbubakar Siddique et.al.2301.00190:mortar_board:None
2022-12-30Unsupervised 4D LiDAR Moving Object Segmentation in Stationary Settings with Multivariate Occupancy Time SeriesThomas Kreutz et.al.2212.14750:mortar_board:Code
2022-12-30Multisensor Multiobject Tracking With High-Dimensional Object StatesWenyu Zhang et.al.2212.14556:mortar_board:None
2022-12-30Estimating Latent Population Flows from Aggregated Data via Inversing Multi-Marginal Optimal TransportSikun Yang et.al.2212.14527:mortar_board:None
2022-12-28Joint Discriminative and Metric Embedding Learning for Person Re-IdentificationSinan Sabri et.al.2212.14107:mortar_board:None
2022-12-24DiP: Learning Discriminative Implicit Parts for Person Re-IdentificationDengjie Li et.al.2212.13906:mortar_board:Code
2022-12-25Human Health Indicator Prediction from Gait VideoZiqing Li et.al.2212.12948:mortar_board:None
2022-12-25Understanding Ethics, Privacy, and Regulations in Smart Video Surveillance for Public SafetyBabak Rahimi Ardabili et.al.2212.12936:mortar_board:None
2022-12-23Mesh of Things (MoT) Network-Driven Anomaly Detection in Connected ObjectsRathinamala Vijay et.al.2212.12221:mortar_board:None
2022-12-22Spatio-Visual Fusion-Based Person Re-Identification for Overhead Fisheye ImagesMertcan Cokbas et.al.2212.11477:mortar_board:None
2022-12-21Photonic integrated beam delivery in a rubidium 3D magneto-optical trapAndrei Isichenko et.al.2212.11417:mortar_board:None
2022-12-20Dain’s invariant for black hole initial dataRobert Sansom et.al.2212.10270:mortar_board:None
2022-12-20Tracking by Associating ClipsSanghyun Woo et.al.2212.10149:mortar_board:None
2022-12-20On the Applicability of Synthetic Data for Re-IdentificationJérôme Rutinowski et.al.2212.10105:mortar_board:Code
2022-12-20Benchmarking person re-identification datasets and approaches for practical real-world implementationsJose Huaman et.al.2212.09981:mortar_board:Code
2022-12-16Feature Disentanglement Learning with Switching and Aggregation for Video-based Person Re-IdentificationMinjung Kim et.al.2212.09498:mortar_board:None
2022-12-17A Brief Survey on Person Recognition at a DistanceChrisopher B. Nalty et.al.2212.08969:mortar_board:None
2022-12-16Nonequilibrium steady state in a large magneto-optical trapMarius Gaudesius et.al.2212.08705:mortar_board:None
2022-12-16Detection-aware multi-object tracking evaluationJuan C. SanMiguel et.al.2212.08536:mortar_board:None
2022-12-16Neural Enhanced Belief Propagation for Multiobject TrackingMingchao Liang et.al.2212.08340:mortar_board:None
2022-12-15Writer Retrieval and Writer Identification in Greek PapyriVincent Christlein et.al.2212.07664:mortar_board:None
2022-12-15Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal CollaborationLiqi Yan et.al.2212.07592:mortar_board:None
2022-12-14Blue-Detuned Magneto-Optical Trap of MoleculesJustin J. Burau et.al.2212.07472:mortar_board:None
2022-12-12CountingMOT: Joint Counting, Detection and Re-Identification for Multiple Object TrackingWeihong Ren et.al.2212.05861:mortar_board:None
2022-12-11Mutimodal Ranking Optimization for Heterogeneous Face Re-identificationHui Hu et.al.2212.05510:mortar_board:None
2022-12-09Occluded Person Re-Identification via Relational Adaptive Feature Correction LearningMinjung Kim et.al.2212.04712:mortar_board:None
2022-12-08Steady-State Ultracold PlasmaB. B. Zelener et.al.2212.04389:mortar_board:None
2022-12-08Complete Solution for Vehicle Re-ID in Surround-view Camera SystemZizhang Wu et.al.2212.04126:mortar_board:None
2022-12-07Multiple Object Tracking Challenge Technical Report for Team MT_IoTFeng Yan et.al.2212.03586:mortar_board:None
2022-12-06Sparse Message Passing Network with Feature Integration for Online Multiple Object TrackingBisheng Wang et.al.2212.02992:mortar_board:None
2022-12-05Generalizable Person Re-Identification via Viewpoint Alignment and FusionBingliang Jiao et.al.2212.02398:mortar_board:None
2022-12-03Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language RepresentationEn Yu et.al.2212.01568:mortar_board:None
2022-12-02CC-3DT: Panoramic 3D Object Tracking via Cross-Camera FusionTobias Fischer et.al.2212.01247:mortar_board:None
2022-12-01Privacy-Preserving Data Synthetisation for Secure Information SharingTânia Carvalho et.al.2212.00484:mortar_board:None
2022-12-01Learning Progressive Modality-shared Transformers for Effective Visible-Infrared Person Re-identificationHu Lu et.al.2212.00226:mortar_board:Code
2022-11-30Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-IdentificationDe Cheng et.al.2211.16847:mortar_board:None
2022-11-29Lifelong Person Re-Identification via Knowledge Refreshing and ConsolidationChunlin Yu et.al.2211.16201:mortar_board:Code
2022-11-29Similarity Distribution based Membership Inference Attack on Person Re-identificationJunyao Gao et.al.2211.15918:mortar_board:None
2022-11-27Dynamic Feature Pruning and Consolidation for Occluded Person Re-IdentificationYuteng Ye et.al.2211.14742:mortar_board:None
2022-11-24Hard to Track Objects with Irregular Motions and Similar Appearances? Make It Easier by Buffering the Matching SpaceFan Yang et.al.2211.14317:mortar_board:None
2022-11-25CLIP-ReID: Exploiting Vision-Language Model for Image Re-Identification without Concrete Text LabelsSiyuan Li et.al.2211.13977:mortar_board:Code
2022-11-24ReFace: Improving Clothes-Changing Re-Identification With Face FeaturesDaniel Arkushin et.al.2211.13807:mortar_board:Code
2022-11-24Automated Driving Systems Data Acquisition and Processing PlatformXin Xia et.al.2211.13425:mortar_board:None
2022-11-22Transformer Based Multi-Grained Features for Unsupervised Person Re-IdentificationJiachen Li et.al.2211.12280:mortar_board:None
2022-11-22Multimodal Data Augmentation for Visual-Infrared Person ReID with Corrupted DataArthur Josi et.al.2211.11925:mortar_board:None
2022-11-22Confidence-guided Centroids for Unsupervised Person Re-IdentificationYunqi Miao et.al.2211.11921:mortar_board:None
2022-11-21A Benchmark of Video-Based Clothes-Changing Person Re-IdentificationLikai Wang et.al.2211.11165:mortar_board:None
2022-11-20A Unified Model for Tracking and Image-Video Detection Has More PowerPeirong Liu et.al.2211.11077:mortar_board:None
2022-11-20Invisible Backdoor Attack with Dynamic Triggers against Person Re-identificationWenli Sun et.al.2211.10933:mortar_board:None
2022-11-18SeaTurtleID: A novel long-span dataset highlighting the importance of timestamps in wildlife re-identificationKostas Papafitsoros et.al.2211.10307:mortar_board:None
2022-11-17MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object DetectorsYuang Zhang et.al.2211.09791:mortar_board:Code
2022-11-17Machine learning opens a doorway for microrheology with optical tweezers in living systemsMatthew G. Smith et.al.2211.09689:mortar_board:None
2022-11-17Multi-Camera Multi-Object Tracking on the Move via Single-Stage Global Association ApproachPha Nguyen et.al.2211.09663:mortar_board:None
2022-11-17Targeted Attention for Generalized- and Zero-Shot LearningAbhijit Suprem et.al.2211.09322:mortar_board:None
2022-11-16Robust Online Video Instance Segmentation with Track QueriesZitong Zhan et.al.2211.09108:mortar_board:None
2022-11-16SMILEtrack: SiMIlarity LEarning for Multiple Object TrackingYu-Hsiang Wang et.al.2211.08824:mortar_board:None
2022-11-15Using Auxiliary Information for Person Re-Identification – A Tutorial OverviewTharindu Fernando et.al.2211.08565:mortar_board:None
2022-11-14SportsTrack: An Innovative Method for Tracking Athletes in Sports ScenesJie Wang et.al.2211.07173:mortar_board:Code
2022-11-13Learning from partially labeled data for multi-organ and tumor segmentationYutong Xie et.al.2211.06894:mortar_board:None
2022-11-12TAPAS: a Toolbox for Adversarial Privacy Auditing of Synthetic DataFlorimond Houssiau et.al.2211.06550:mortar_board:Code
2022-11-09Efficient Joint Detection and Multiple Object Tracking with Spatially Aware TransformerSiddharth Sagar Nijhawan et.al.2211.05654:mortar_board:None
2022-11-10HSGNet: Object Re-identification with Hierarchical Similarity Graph NetworkFei Shen et.al.2211.05486:mortar_board:None
2022-11-09MEVID: Multi-view Extended Videos with Identities for Video Person Re-IdentificationDaniel Davila et.al.2211.04656:mortar_board:None
2022-11-06Sequential Transformer for End-to-End Person SearchLong Chen et.al.2211.04323:mortar_board:None
2022-11-08ShaSTA: Modeling Shape and Spatio-Temporal Affinities for 3D Multi-Object TrackingTara Sadjadpour et.al.2211.03919:mortar_board:None
2022-11-07Body Part-Based Representation Learning for Occluded Person Re-IdentificationVladimir Somers et.al.2211.03679:mortar_board:None
2022-11-07Generalizable Re-Identification from Videos with Cycle AssociationZhongdao Wang et.al.2211.03663:mortar_board:None
2022-11-07Camera Alignment and Weighted Contrastive Learning for Domain Adaptation in Video Person ReIDDjebril Mekhazni et.al.2211.03626:mortar_board:None
2022-11-04Development and evaluation of automated localization and reconstruction of all fruits on tomato plants in a greenhouse based on multi-view perception and 3D multi-object trackingDavid Rapado Rincon et.al.2211.02760:mortar_board:None
2022-10-30PhysioGait: Context-Aware Physiological Context Modeling for Person Re-identification Attack on Wearable SensingJames O Sullivan et.al.2211.02622:mortar_board:None
2022-11-03Large Scale Real-World Multi-Person TrackingBing Shuai et.al.2211.02175:mortar_board:None
2022-11-03Privacy-preserving Deep Learning based Record LinkageThilina Ranbaduge et.al.2211.02161:mortar_board:None
2022-11-02Generation of Anonymous Chest Radiographs Using Latent Diffusion Models for Training Thoracic Abnormality Classification SystemsKai Packhäuser et.al.2211.01323:mortar_board:None
2022-11-02Deep Multimodal Fusion for Generalizable Person Re-identificationSuncheng Xiang et.al.2211.00933:mortar_board:Code
2022-10-29SearchTrack: Multiple Object Tracking with Object-Customized Search and Motion-Aware FeaturesZhong-Min Tsai et.al.2210.16572:mortar_board:Code
2022-10-26End-to-end Tracking with a Multi-query TransformerBruno Korbar et.al.2210.14601:mortar_board:None
2022-10-25Towards improved loading, cooling, and trapping of molecules in magneto-optical trapsThomas K. Langin et.al.2210.14223:mortar_board:None
2022-10-25Jet-Loaded Cold Atomic Beam Source for StrontiumMinho Kwon et.al.2210.14186:mortar_board:None
2022-10-25Fast loading of a cold mixture of Sodium and Potassium atoms from compact and versatile cold atomic beam sourcesSagar Sutradhar et.al.2210.14084:mortar_board:None
2022-10-25Unsupervised domain-adaptive person re-identification with multi-camera constraintsS. Takeuchi et.al.2210.13999:mortar_board:None
2022-10-24Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense RepresentationsAmit Galor et.al.2210.13570:mortar_board:Code
2022-10-23DMODE: Differential Monocular Object Distance Estimation Module without Class Specific InformationPedram Agand et.al.2210.12596:mortar_board:Code
2022-10-23A dichotomous behavior of Guttman-Kaiser criterion from equi-correlated normal populationYohji Akama et.al.2210.12580:mortar_board:None
2022-10-20End-to-End Context-Aided Unicity Matching for Person Re-identificationMin Cao et.al.2210.12008:mortar_board:None
2022-10-19RT-MOT: Confidence-Aware Real-Time Scheduling Framework for Multi-Object Tracking TasksDonghwa Kang et.al.2210.11946:mortar_board:None
2022-10-19RLM-Tracking: Online Multi-Pedestrian Tracking Supported by Relative Location MappingKai Ren et.al.2210.10477:mortar_board:None
2022-10-19Domain generalization Person Re-identification on Attention-aware multi-operation strategeryYingchun Guo et.al.2210.10409:mortar_board:None
2022-10-19CLIP-Driven Fine-grained Text-Image Person Re-identificationShuanglin Yan et.al.2210.10276:mortar_board:None
2022-10-18Optical Two-dimensional Coherent Spectroscopy of Cold AtomsDanfu Liang et.al.2210.10115:mortar_board:None
2022-10-18Risk of re-identification for shared clinical speech recordingsDaniela A. Wiepert et.al.2210.09975:mortar_board:None
2022-10-17Track Targets by Dense Spatio-Temporal Position EncodingJinkun Cao et.al.2210.09455:mortar_board:None
2022-10-17Joint Plasticity Learning for Camera Incremental Person Re-IdentificationZexian Yang et.al.2210.08710:mortar_board:None
2022-10-16AttTrack: Online Deep Attention Transfer for Multi-object TrackingKeivan Nalaie et.al.2210.08648:mortar_board:None
2022-10-16Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge DevicesYimeng Zhang et.al.2210.08578:mortar_board:None
2022-10-14Quo Vadis: Is Trajectory Forecasting the Key Towards Long-Term Multi-Object Tracking?Patrick Dendorfer et.al.2210.07681:mortar_board:None
2022-10-12QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object TrackingTobias Fischer et.al.2210.06984:mortar_board:None
2022-10-11Parallel Augmentation and Dual Enhancement for Occluded Person Re-identificationZi wang et.al.2210.05438:mortar_board:None
2022-10-11EnsembleMOT: A Step towards Ensemble Learning of Multiple Object TrackingYunhao Du et.al.2210.05278:mortar_board:Code
2022-10-07Specialized Re-Ranking: A Novel Retrieval-Verification Framework for Cloth Changing Person Re-IdentificationRenjie Zhang et.al.2210.03592:mortar_board:None
2022-10-07PS-ARM: An End-to-End Attention-aware Relation Mixer Network for Person SearchMustansar Fiaz et.al.2210.03433:mortar_board:Code
2022-10-07Multiple Object Tracking from appearance by hierarchically clustering trackletsAndreu Girbau et.al.2210.03355:mortar_board:Code
2022-10-07Dual Clustering Co-teaching with Consistent Sample Mining for Unsupervised Person Re-IdentificationZeqi Chen et.al.2210.03339:mortar_board:None
2022-10-05SoccerNet 2022 Challenges ResultsSilvio Giancola et.al.2210.02365:mortar_board:None
2022-10-05MOTSLAM: MOT-assisted monocular dynamic SLAM using single-view depth estimationHanwei Zhang et.al.2210.02038:mortar_board:None
2022-10-04Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-IdentificationKai Wang et.al.2210.01600:mortar_board:Code
2022-10-04How Image Generation Helps Visible-to-Infrared Person Re-Identification?Honghu Pan et.al.2210.01585:mortar_board:None
2022-10-04FRIDA: Fisheye Re-Identification Dataset with AnnotationsMertcan Cokbas et.al.2210.01582:mortar_board:None
2022-10-03Interpretable Deep TrackingBenjamin Thérien et.al.2210.01266:mortar_board:None
2022-09-30Robust Person Identification: A WiFi Vision-based ApproachYili Ren et.al.2210.00127:mortar_board:None
2022-09-30Transformers for Object Detection in Large Point CloudsFelicia Ruppel et.al.2209.15258:mortar_board:None
2022-09-30Physical Adversarial Attack meets Computer Vision: A Decade SurveyHui Wei et.al.2209.15179:mortar_board:Code
2022-09-29DirectTracker: 3D Multi-Object Tracking Using Direct Image Alignment and Photometric Bundle AdjustmentMariia Gladkova et.al.2209.14965:mortar_board:None
2022-09-27Observation Centric and Central Distance Recovery on Sports Player TrackingHsiang-Wei Huang et.al.2209.13154:mortar_board:None
2022-09-25D³: Duplicate Detection Decontaminator for Multi-Athlete Tracking in Sports VideosRui He et.al.2209.12248:mortar_board:Code
2022-09-25BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in VideoAli Athar et.al.2209.12118:mortar_board:Code
2022-09-24Super-resolution atomic microscopy using orbit angular momentum laser with temporal modulationYuan Liu et.al.2209.11917:mortar_board:None
2022-09-23Multi-Granularity Graph Pooling for Video-based Person Re-IdentificationHonghu Pan et.al.2209.11584:mortar_board:None
2022-09-23Pose-Aided Video-based Person Re-Identification via Recurrent Graph Convolutional NetworkHonghu Pan et.al.2209.11582:mortar_board:None
2022-09-23Deep Learning-based Anonymization of Chest Radiographs: A Utility-preserving Measure for Patient PrivacyKai Packhäuser et.al.2209.11531:mortar_board:None
2022-09-23Grouped Adaptive Loss Weighting for Person SearchYanling Tian et.al.2209.11492:mortar_board:None
2022-09-23Towards Frame Rate Agnostic Multi-Object TrackingWeitao Feng et.al.2209.11404:mortar_board:Code
2022-09-23Horizon area bound and MOTS stability in locally rotationally symmetric solutionsAbbas M. Sherif et.al.2209.11358:mortar_board:None
2022-09-20Sampling Agnostic Feature Representation for Long-Term Person Re-identificationSeongyeop Yang et.al.2209.09574:mortar_board:Code
2022-09-19Visible-Infrared Person Re-Identification Using Privileged Intermediate InformationMahdi Alehdaghi et.al.2209.09348:mortar_board:Code
2022-09-19Uncertainty Aware Multitask Pyramid Vision Transformer For UAV-Based Object Re-IdentificationSyeda Nyma Ferdous et.al.2209.08686:mortar_board:None
2022-09-18RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-supervised LearningWei-Ting Chen et.al.2209.08630:mortar_board:Code
2022-09-18Bi-color atomic beam slower and magnetic field compensation for ultracold gasesJianing Li et.al.2209.08479:mortar_board:None
2022-09-14TrADe Re-ID – Live Person Re-Identification using Tracking and Anomaly DetectionLuigy Machaca et.al.2209.06452:mortar_board:None
2022-09-12Style Variable and Irrelevant Learning for Generalizable Person Re-identificationHaobo Chen et.al.2209.05235:mortar_board:Code
2022-09-12Is Synthetic Dataset Reliable for Benchmarking Generalizable Person Re-Identification?Cuicui Kang et.al.2209.05047:mortar_board:None
2022-09-11Local-Aware Global Attention Network for Person Re-IdentificationNathanael L. Baisa et.al.2209.04821:mortar_board:None
2022-09-11Multiple Object Tracking in Recent Times: A Literature ReviewMk Bashar et.al.2209.04796:mortar_board:None
2022-09-08PixTrack: Precise 6DoF Object Pose Tracking using NeRF Templates and Feature-metric AlignmentPrajwal Chidananda et.al.2209.03910:mortar_board:None
2022-09-06CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking with Camera-LiDAR FusionLi Wang et.al.2209.02540:mortar_board:None
2022-09-04On the Risks of Collecting Multidimensional Data Under Local Differential PrivacyHéber H. Arcolezi et.al.2209.01684:mortar_board:Code
2022-09-01Which anonymization technique is best for which NLP task? – It depends. A Systematic Study on Clinical Text ProcessingIyadh Ben Cheikh Larbi et.al.2209.00262:mortar_board:None
2022-08-30The Athena X-ray Integral Field Unit: a consolidated design for the system requirement review of the preliminary definition phaseDidier Barret et.al.2208.14562:mortar_board:None
2022-08-30Synthehicle: Multi-Vehicle Multi-Camera Tracking in Virtual CitiesFabian Herzog et.al.2208.14167:mortar_board:Code
2022-08-27Actor-identified Spatiotemporal Action Detection – Detecting Who Is Doing What in VideosFan Yang et.al.2208.12940:mortar_board:Code
2022-08-25Identity-Sensitive Knowledge Propagation for Cloth-Changing Person Re-identificationJianbing Wu et.al.2208.12023:mortar_board:Code
2022-08-25Skeleton Prototype Contrastive Learning with Multi-Level Graph Relation Modeling for Unsupervised Person Re-IdentificationHaocong Rao et.al.2208.11814:mortar_board:Code
2022-08-24Dynamic Template Initialization for Part-Aware Person Re-IDKalana Abeywardena et.al.2208.11440:mortar_board:None
2022-08-23Quality Matters: Embracing Quality Clues for Robust 3D Multi-Object TrackingJinrong Yang et.al.2208.10976:mortar_board:None
2022-08-22Information-Theoretic Equivalence of Entropic Multi-Marginal Optimal Transport: A Theory for Multi-Agent CommunicationShuchan Wang et.al.2208.10256:mortar_board:None
2022-08-22Minkowski Tracker: A Sparse Spatio-Temporal R-CNN for Joint Object Detection and TrackingJunYoung Gwak et.al.2208.10056:mortar_board:None
2022-08-21CycleTrans: Learning Neutral yet Discriminative Features for Visible-Infrared Person Re-IdentificationQiong Wu et.al.2208.09844:mortar_board:None
2022-08-19Synthetic Data in Human Analysis: A SurveyIndu Joshi et.al.2208.09191:mortar_board:None
2022-08-18Domain Camera Adaptation and Collaborative Multiple Feature Clustering for Unsupervised Person Re-IDYuanpeng Tu et.al.2208.08624:mortar_board:None
2022-08-17DeepSportradar-v1: Computer Vision Dataset for Sports Understanding with High Quality AnnotationsGabriel Van Zandycke et.al.2208.08190:mortar_board:Code
2022-08-17InterTrack: Interaction Transformer for 3D Multi-Object TrackingJohn Willes et.al.2208.08041:mortar_board:None
2022-08-13Enhanced Vehicle Re-identification for ITS: A Feature Fusion approach using Deep LearningAshutosh Holla B et.al.2208.06579:mortar_board:None
2022-08-09Privacy-Aware Adversarial Network in Human Mobility PredictionYuting Zhan et.al.2208.05009:mortar_board:None
2022-08-08Occlusion-Aware Instance Segmentation via BiLayer Network ArchitecturesLei Ke et.al.2208.04438:mortar_board:Code
2022-08-07Robust Multi-Object Tracking by Marginal InferenceYifu Zhang et.al.2208.03727:mortar_board:None
2022-08-06Transformer-based assignment decision network for multiple object trackingAthena Psalta et.al.2208.03571:mortar_board:Code
2022-08-05Accelerating the Sinkhorn algorithm for sparse multi-marginal optimal transport by fast Fourier transformsFatima Antarou Ba et.al.2208.03120:mortar_board:Code
2022-08-04SOMPT22: A Surveillance Oriented Multi-Pedestrian Tracking DatasetFatih Emre Simsek et.al.2208.02580:mortar_board:None
2022-08-04Learning Modal-Invariant and Temporal-Memory for Video-based Visible-Infrared Person Re-IdentificationXinyu Lin et.al.2208.02450:mortar_board:Code
2022-08-03PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?Aleksandr Kim et.al.2208.01957:mortar_board:None
2022-08-01Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identificationXulin Li et.al.2208.00967:mortar_board:None
2022-08-01Multi-spectral Vehicle Re-identification with Cross-directional Consistency Network and a High-quality BenchmarkAihua Zheng et.al.2208.00632:mortar_board:None
2022-07-30Towards Privacy-Preserving, Real-Time and Lossless Feature MatchingQiang Meng et.al.2208.00214:mortar_board:None
2022-07-29A Transfer Learning-Based Approach to Marine Vessel Re-IdentificationGuangmiao Zeng et.al.2207.14500:mortar_board:None
2022-07-29Significant changes in EEG neural oscillations during different phases of three-dimensional multiple object tracking task (3D-MOT) imply different roles for attention and working memoryYannick Roy et.al.2207.14470:mortar_board:None
2022-07-29Deep Learning-based Occluded Person Re-identification: A SurveyYunjie Peng et.al.2207.14452:mortar_board:None
2022-07-28The One Where They Reconstructed 3D Humans and Environments in TV ShowsGeorgios Pavlakos et.al.2207.14279:mortar_board:None
2022-07-28Video Mask Transfiner for High-Quality Video Instance SegmentationLei Ke et.al.2207.14012:mortar_board:None
2022-07-27Look Closer to Your Enemy: Learning to Attack via Teacher-student MimickingMingejie Wang et.al.2207.13381:mortar_board:None
2022-07-27Portrait Interpretation and a BenchmarkYixuan Fan et.al.2207.13315:mortar_board:None
2022-07-26Tracking Every Thing in the WildSiyuan Li et.al.2207.12978:mortar_board:None
2022-07-26TransFiner: A Full-Scale Refinement Approach for Multiple Object TrackingBin Sun et.al.2207.12967:mortar_board:None
2022-07-25Video object tracking based on YOLOv7 and DeepSORTFeng Yang et.al.2207.12202:mortar_board:None
2022-07-25Domain Adaptive Person SearchJunjie Li et.al.2207.11898:mortar_board:Code
2022-07-24Spatial-Temporal Federated Learning for Lifelong Person Re-identification on Distributed EdgesLei Zhang et.al.2207.11759:mortar_board:Code
2022-07-24Learnable Privacy-Preserving Anonymization for Pedestrian ImagesJunwu Zhang et.al.2207.11677:mortar_board:Code
2022-07-22PieTrack: An MOT solution based on synthetic data training and self-supervised domain adaptationYirui Wang et.al.2207.11325:mortar_board:None
2022-07-21UFO: Unified Feature OptimizationTeng Xi et.al.2207.10341:mortar_board:None
2022-07-21OIMNet++: Prototypical Normalization and Localization-aware Learning for Person SearchSanghoon Lee et.al.2207.10320:mortar_board:None
2022-07-20MOTCOM: The Multi-Object Tracking Dataset Complexity MetricMalte Pedersen et.al.2207.10031:mortar_board:None
2022-07-20Negative Samples are at Large: Leveraging Hard-distance Elastic Loss for Re-identificationHyungtae Lee et.al.2207.09884:mortar_board:None
2022-07-19The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and CountingJustin Kay et.al.2207.09295:mortar_board:Code
2022-07-19Dynamic Prototype Mask for Occluded Person Re-IdentificationLei Tan et.al.2207.09046:mortar_board:Code
2022-07-18A Certifiable Security Patch for Object Tracking in Self-Driving Systems via Historical Deviation ModelingXudong Pan et.al.2207.08556:mortar_board:None
2022-07-18A Semantic-aware Attention and Visual Shielding Network for Cloth-changing Person Re-identificationZan Gao et.al.2207.08387:mortar_board:None
2022-07-16Cross Vision-RF Gait Re-identification with Low-cost RGB-D Cameras and mmWave RadarsDongjiang Cao et.al.2207.07896:mortar_board:None
2022-07-16Learning Granularity-Unified Representations for Text-to-Image Person Re-identificationZhiyin Shao et.al.2207.07802:mortar_board:None
2022-07-15Multi-Object Tracking and Segmentation via Neural Message PassingGuillem Braso et.al.2207.07454:mortar_board:Code
2022-07-15Towards Privacy-Preserving Person Re-identification via Person Identify ShiftShuguang Dou et.al.2207.07311:mortar_board:None
2022-07-14Towards Grand Unification of Object TrackingBin Yan et.al.2207.07078:mortar_board:Code
2022-07-13Rapid Person Re-Identification via Sub-space Consistency RegularizationQingze Yin et.al.2207.05933:mortar_board:None
2022-07-12SpOT: Spatiotemporal Modeling for 3D Object TrackingColton Stearns et.al.2207.05856:mortar_board:None
2022-07-12Dynamic Gradient Reactivation for Backward Compatible Person Re-identificationXiao Pan et.al.2207.05658:mortar_board:None
2022-07-12Tracking Objects as Pixel-wise DistributionsZelin Zhao et.al.2207.05518:mortar_board:Code
2022-07-10Depth Perspective-aware Multiple Object TrackingKha Gia Quach et.al.2207.04551:mortar_board:None
2022-07-08TGRMPT: A Head-Shoulder Aided Multi-Person Tracker and a New Large-Scale Dataset for Tour-Guide RobotWen Wang et.al.2207.03726:mortar_board:Code
2022-07-08Frequency-based Randomization for Guaranteeing Differential Privacy in Spatial TrajectoriesFengmei Jin et.al.2207.03722:mortar_board:None
2022-07-07Privacy-Preserving Synthetic Educational Data GenerationJill-Jênn Vie et.al.2207.03202:mortar_board:Code
2022-07-07Style Interleaved Learning for Generalizable Person Re-identificationWentao Tan et.al.2207.03132:mortar_board:None
2022-07-06Context Sensing Attention Network for Video-based Person Re-identificationKan Wang et.al.2207.02631:mortar_board:None
2022-07-06Unsupervised Learning for Human Sensing Using Radio SignalsTianhong Li et.al.2207.02370:mortar_board:None
2022-07-05Video-based Surgical Skills Assessment using Long term Tool TrackingMona Fathollahi et.al.2207.02247:mortar_board:None
2022-07-04Adversarial Pairwise Reverse Attention for Camera Performance Imbalance in Person Re-identification: New Dataset and MetricsEugene P. W. Ang et.al.2207.01204:mortar_board:None
2022-06-29BoT-SORT: Robust Associations Multi-Pedestrian TrackingNir Aharon et.al.2206.14651:mortar_board:Code
2022-06-29SRCN3D: Sparse R-CNN 3D Surround-View Camera Object Detection and Tracking for Autonomous DrivingYining Shi et.al.2206.14451:mortar_board:Code
2022-06-283D Multi-Object Tracking with Differentiable Pose EstimationDominik Schmauser et.al.2206.13785:mortar_board:None
2022-06-27A compact setup for loading magneto-optical trap in ultrahigh vacuum environmentKavish Bharadwaj et.al.2206.13271:mortar_board:None
2022-06-23Learning Towards the Largest MarginsXiong Zhou et.al.2206.11589:mortar_board:None
2022-06-21GNN-PMB: A Simple but Effective Online 3D Multi-Object Tracker without Bells and WhistlesJianan Liu et.al.2206.10255:mortar_board:Code
2022-06-19mvHOTA: A multi-view higher order tracking accuracy metric to measure spatial and temporal associations in multi-point detectionLalith Sharan et.al.2206.09372:mortar_board:None
2022-06-19Towards Generalizable Person Re-identification with a Bi-stream Generative ModelXin Xu et.al.2206.09362:mortar_board:None
2022-06-14Plug-and-Play Pseudo Label Correction Network for Unsupervised Person Re-identificationTianyi Yan et.al.2206.06607:mortar_board:None
2022-06-13A novel reconstruction attack on foreign-trade official statistics, with a Brazilian case studyDanilo Fabrino Favato et.al.2206.06493:mortar_board:None
2022-06-10An Image Processing Pipeline for Camera Trap Time-Lapse RecordingsMichael L. Hilton et.al.2206.05159:mortar_board:Code
2022-06-09Simple Cues Lead to a Strong Multi-Object TrackerJenny Seidenschwarz et.al.2206.04656:mortar_board:None
2022-06-09Cross-modal Local Shortest Path and Global Enhancement for Visible-Thermal Person Re-IdentificationXiaohong Wang et.al.2206.04401:mortar_board:None
2022-06-08Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and TrackingLonglong Jing et.al.2206.03666:mortar_board:None
2022-06-06NORPPA: NOvel Ringed seal re-identification by Pelage Pattern AggregationEkaterina Nepovinnykh et.al.2206.02498:mortar_board:Code
2022-06-06Sports Re-ID: Improving Re-Identification Of Players In Broadcast Videos Of Team SportsBharath Comandur et.al.2206.02373:mortar_board:None
2022-06-05Towards Individual Grevy’s Zebra Identification via Deep 3D Fitting and Metric LearningMaria Stennett et.al.2206.02261:mortar_board:None
2022-06-05SealID: Saimaa ringed seal re-identification datasetEkaterina Nepovinnykh et.al.2206.02260:mortar_board:None

VIT

Publish Date Title Authors arxiv PDF Code
2023-10-20Feature Selection and Hyperparameter Fine-tuning in Artificial Neural Networks for Wood Quality ClassificationMateus Roder et.al.2310.13490:mortar_board:None
2023-10-12UniPose: Detecting Any KeypointsJie Yang et.al.2310.08530:mortar_board:Code
2023-10-10l-dyno: framework to learn consistent visual features using robot’s motionKartikeya Singh et.al.2310.06249:mortar_board:None
2023-10-08Language-driven Open-Vocabulary Keypoint Detection for Animal Body and FaceHao Zhang et.al.2310.05056:mortar_board:None
2023-10-02H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous ManipulationYanjie Ze et.al.2310.01404:mortar_board:Code
2023-10-01Self-supervised Learning of Contextualized Local Visual EmbeddingsThalles Santos Silva et.al.2310.00527:mortar_board:Code
2023-09-26ObVi-SLAM: Long-Term Object-Visual SLAMAmanda Adkins et.al.2309.15268:mortar_board:Code
2023-09-19LiDAR-Generated Images Derived Keypoints Assisted Point Cloud Registration Scheme in Odometry EstimationHaizhou Zhang et.al.2309.10436:mortar_board:Code
2023-09-18RIDE: Self-Supervised Learning of Rotation-Equivariant Keypoint Detection and Invariant Description for EndoscopyMert Asim Karaoglu et.al.2309.09563:mortar_board:None
2023-09-17CryoAlign: feature-based method for global and local 3D alignment of EM density mapsBintao He et.al.2309.09217:mortar_board:None
2023-09-14EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual LocalizationMinjung Kim et.al.2309.07471:mortar_board:Code
2023-09-09Mirror-Aware Neural HumansDaniel Ajisafe et.al.2309.04750:mortar_board:None
2023-09-07InstructDiffusion: A Generalist Modeling Interface for Vision TasksZigang Geng et.al.2309.03895:mortar_board:None
2023-09-04SKoPe3D: A Synthetic Dataset for Vehicle Keypoint Perception in 3D from Traffic Monitoring CamerasHimanshu Pahadia et.al.2309.01324:mortar_board:None
2023-09-01Improving the matching of deformable objects by learning to detect keypointsFelipe Cadar et.al.2309.00434:mortar_board:Code
2023-08-31SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame InterpolationJiaben Chen et.al.2308.16876:mortar_board:None
2023-08-30Learning Structure-from-Motion with Graph Attention NetworksLucas Brynte et.al.2308.15984:mortar_board:None
2023-08-29A lightweight 3D dense facial landmark estimation model from position map dataShubhajit Basak et.al.2308.15170:mortar_board:None
2023-08-27Automatic coarse co-registration of point clouds from diverse scan geometries: a test of detectors and descriptorsFrancesco Pirotti et.al.2308.14047:mortar_board:None
2023-08-24VNI-Net: Vector Neurons-based Rotation-Invariant Descriptor for LiDAR Place RecognitionGengxuan Tian et.al.2308.12870:mortar_board:None
2023-08-22LDP-Feat: Image Features with Local Differential PrivacyFrancesco Pittaluga et.al.2308.11223:mortar_board:None
2023-08-20Neural Interactive Keypoint DetectionJie Yang et.al.2308.10174:mortar_board:Code
2023-08-19ClothesNet: An Information-Rich 3D Garment Model Repository with Simulated Clothes EnvironmentBingyang Zhou et.al.2308.09987:mortar_board:None
2023-08-16DeDoDe: Detect, Don’t Describe – Describe, Don’t Detect for Local Feature MatchingJohan Edstedt et.al.2308.08479:mortar_board:Code
2023-08-15CoDeF: Content Deformation Fields for Temporally Consistent Video ProcessingHao Ouyang et.al.2308.07926:mortar_board:Code
2023-08-15ChartDETR: A Multi-shape Detection Network for Visual Chart RecognitionWenyuan Xue et.al.2308.07743:mortar_board:None
2023-08-14DELO: Deep Evidential LiDAR Odometry using Partial Optimal TransportSk Aziz Ali et.al.2308.07153:mortar_board:None
2023-08-102D3D-MATR: 2D-3D Matching Transformer for Detection-free Registration between Images and Point CloudsMinhao Li et.al.2308.05667:mortar_board:None
2023-07-29Automated Hit-frame Detection for Badminton Match AnalysisYu-Hang Chien et.al.2307.16000:mortar_board:Code
2023-07-25Mini-PointNetPlus: a local feature descriptor in deep learning model for 3d environment perceptionChuanyu Luo et.al.2307.13300:mortar_board:None
2023-07-20Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited DataSahar Almahfouz Nasser et.al.2307.10698:mortar_board:Code
2023-07-19SAMConvex: Fast Discrete Optimization for CT Registration using Self-supervised Anatomical Embedding and Correlation PyramidZi Li et.al.2307.09727:mortar_board:None
2023-07-01SyMFM6D: Symmetry-aware Multi-directional Fusion for Multi-View 6D Object Pose EstimationFabian Duffhauss et.al.2307.00306:mortar_board:Code
2023-06-27Detector-Free Structure from MotionXingyi He et.al.2306.15669:mortar_board:Code
2023-06-26CLERA: A Unified Model for Joint Cognitive Load and Eye Region Analysis in the WildLi Ding et.al.2306.15073:mortar_board:None
2023-06-12Topology Repairing of Disconnected Pulmonary Airways and Vessels: Baselines and a DatasetZiqiao Weng et.al.2306.07089:mortar_board:Code
2023-06-07Learning Probabilistic Coordinate Fields for Robust CorrespondencesWeiyue Zhao et.al.2306.04231:mortar_board:None
2023-06-03LDEB – Label Digitization with Emotion Binarization and Machine Learning for Emotion Recognition in Conversational DialoguesAmitabha Dey et.al.2306.02193:mortar_board:None
2023-06-02Self-supervised Interest Point Detection and Description for Fisheye and Perspective ImagesMarcela Mera-Trujillo et.al.2306.01938:mortar_board:None
2023-06-01A Probabilistic Relaxation of the Two-Stage Object Pose Estimation ParadigmOnur Beker et.al.2306.00892:mortar_board:None
2023-05-30Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change DetectionSupeng Wang et.al.2305.18714:mortar_board:Code
2023-05-23Diffusion Hyperfeatures: Searching Through Time and Space for Semantic CorrespondenceGrace Luo et.al.2305.14334:mortar_board:None
2023-05-15Non-Separable Multi-Dimensional Network Flows for Visual ComputingViktoria Ehm et.al.2305.08628:mortar_board:None
2023-05-13Illumination-insensitive Binary Descriptor for Visual Measurement Based on Local Inter-patch InvarianceXinyu Lin et.al.2305.07943:mortar_board:Code
2023-05-05HD2Reg: Hierarchical Descriptors and Detectors for Point Cloud RegistrationCanhui Tang et.al.2305.03487:mortar_board:Code
2023-04-17Human Pose Estimation in Monocular Omnidirectional Top-View ImagesJingrui Yu et.al.2304.08186:mortar_board:None
2023-04-14CoPR: Towards Accurate Visual Localization With Continuous Place-descriptor RegressionMubariz Zaffar et.al.2304.07426:mortar_board:None
2023-04-12SiLK – Simple Learned KeypointsPierre Gleize et.al.2304.06194:mortar_board:Code
2023-04-06From Saliency to DINO: Saliency-guided Vision Transformer for Few-shot Keypoint DetectionChangsheng Lu et.al.2304.03140:mortar_board:None
2023-03-29NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point CloudXiangyu Zhu et.al.2303.16465:mortar_board:None
2023-03-24PanoVPR: Towards Unified Perspective-to-Equirectangular Visual Place Recognition via Sliding Windows across the Panoramic ViewZe Shi et.al.2303.14095:mortar_board:Code
2023-03-23Semantic Image Attack for Visual Model DiagnosisJinqi Luo et.al.2303.13010:mortar_board:None
2023-03-22Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty PropagationHeng Yang et.al.2303.12246:mortar_board:None
2023-03-19RN-Net: Reservoir Nodes-Enabled Neuromorphic Vision Sensing NetworkSangmin Yoo et.al.2303.10770:mortar_board:None
2023-03-17ShaRPy: Shape Reconstruction and Hand Pose Estimation from RGB-D with UncertaintyVanessa Wirth et.al.2303.10042:mortar_board:None
2023-03-15Descriptor Distillation for Efficient Multi-Robot SLAMXiyue Guo et.al.2303.08420:mortar_board:None
2023-03-15From Local Binary Patterns to Pixel Difference Networks for Efficient Visual Representation LearningZhuo Su et.al.2303.08414:mortar_board:None
2023-03-09KGNv2: Separating Scale and Pose Prediction for Keypoint-based 6-DoF Grasp Synthesis on RGB-D inputYiye Chen et.al.2303.05617:mortar_board:None
2023-03-07External Camera-based Mobile Robot Pose Estimation for Collaborative Perception with Smart Edge SensorsSimon Bultmann et.al.2303.03797:mortar_board:None
2023-02-26PaRK-Detect: Towards Efficient Multi-Task Satellite Imagery Road Extraction via Patch-Wise Keypoints DetectionShenwei Xie et.al.2302.13263:mortar_board:None
2023-02-24Hybrid machine-learned homogenization: Bayesian data mining and convolutional neural networksJulian Lißner et.al.2302.12545:mortar_board:None
2023-02-21Deep Reinforcement Learning Based on Local GNN for Goal-conditioned Deformable Object RearrangingYuhong Deng et.al.2302.10446:mortar_board:None
2023-02-12A Correct-and-Certify Approach to Self-Supervise Object Pose Estimators via Ensemble Self-TrainingJingnan Shi et.al.2302.06019:mortar_board:None
2023-02-11Rethinking Vision Transformer and Masked Autoencoder in Multimodal Face Anti-SpoofingZitong Yu et.al.2302.05744:mortar_board:None
2023-02-09MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint DetectionYuhe Ding et.al.2302.04589:mortar_board:None
2023-02-03Explicit Box Detection Unifies End-to-End Multi-Person Pose EstimationJie Yang et.al.2302.01593:mortar_board:Code
2023-02-03Simple, Effective and General: A New Backbone for Cross-view Image Geo-localizationYingying Zhu et.al.2302.01572:mortar_board:Code
2023-01-21Vision Aided Environment Semantics Extraction and Its Application in mmWave Beam SelectionFeiyang Wen et.al.2301.08973:mortar_board:None
2023-01-18OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD ModelsXingyi He et.al.2301.07673:mortar_board:None
2023-01-12Towards High Performance One-Stage Human Pose EstimationLing Li et.al.2301.04842:mortar_board:None
2022-12-31Rethinking Rotation Invariance with Point Cloud RegistrationJianhui Yu et.al.2301.00149:mortar_board:None
2022-12-29Fruit Ripeness Classification: a SurveyMatteo Rizzo et.al.2212.14441:mortar_board:None
2022-12-28NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same ActionKuan-Chieh Wang et.al.2212.13660:mortar_board:None
2022-12-24HandsOff: Labeled Dataset Generation With No Additional Human AnnotationsAustin Xu et.al.2212.12645:mortar_board:None
2022-12-13Learning to Detect Good Keypoints to Match Non-Rigid Objects in RGB ImagesWelerson Melo et.al.2212.09589:mortar_board:None
2022-12-15Learning Markerless Robot-Depth Camera Calibration and End-Effector Pose EstimationBugra C. Sefercik et.al.2212.07567:mortar_board:None
2022-12-08DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localizationXiangyu Xu et.al.2212.04575:mortar_board:None
2022-12-07ViTPose+: Vision Transformer Foundation Model for Generic Body Pose EstimationYufei Xu et.al.2212.04246:mortar_board:Code
2022-12-07Designing Feature Vector Representations: A case study from ChemistrySigne Sidwall Thygesen et.al.2212.03731:mortar_board:None
2022-12-06DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic ModelJeongjun Choi et.al.2212.02796:mortar_board:None
2022-12-05Images Speak in Images: A Generalist Painter for In-Context Visual LearningXinlong Wang et.al.2212.02499:mortar_board:Code
2022-12-05R2FD2: Fast and Robust Matching of Multimodal Remote Sensing Image via Repeatable Feature Detector and Rotation-invariant Feature DescriptorBai Zhu et.al.2212.02277:mortar_board:None
2022-11-28FeatureBooster: Boosting Feature Descriptors with a Lightweight Neural NetworkXinjiang Wang et.al.2211.15069:mortar_board:None
2022-11-27BALF: Simple and Efficient Blur Aware Local Feature DetectorZhenjun Zhao et.al.2211.14731:mortar_board:None
2022-11-21Conjugate Product Graphs for Globally Optimal 2D-3D Shape MatchingPaul Roetzer et.al.2211.11589:mortar_board:None
2022-11-07Learning Feature Descriptors for Pre- and Intra-operative Point Cloud Matching for Laparoscopic Liver RegistrationZixin Yang et.al.2211.03688:mortar_board:None
2022-10-31Tree Detection and Diameter Estimation Based on Deep LearningVincent Grondin et.al.2210.17424:mortar_board:Code
2022-10-26Learning a Task-specific Descriptor for Robust Matching of 3D Point CloudsZhiyuan Zhang et.al.2210.14899:mortar_board:None
2022-10-23Few-Shot Meta Learning for Recognizing Facial Phenotypes of Genetic DisordersÖmer Sümer et.al.2210.12705:mortar_board:None
2022-10-21Real-time Detection of 2D Tool Landmarks with Synthetic Training DataBram Vanherle et.al.2210.11991:mortar_board:None
2022-10-09Fusing Event-based Camera and Radar for SLAM Using Spiking Neural Networks with Continual STDP LearningAli Safa et.al.2210.04236:mortar_board:None
2022-10-04Centroid Distance Keypoint Detector for Colored Point CloudsHanzhe Teng et.al.2210.01298:mortar_board:Code
2022-09-28Category-Level Global Camera Pose Estimation with Multi-Hypothesis Point Cloud CorrespondencesJun-Jee Chao et.al.2209.14419:mortar_board:None
2022-09-28USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable ManipulationZhengrong Xue et.al.2209.13864:mortar_board:None
2022-09-27Suture Thread Spline Reconstruction from Endoscopic Images for Robotic Surgery with Reliability-driven Keypoint DetectionNeelay Joglekar et.al.2209.13657:mortar_board:Code
2022-09-27Learning-Based Dimensionality Reduction for Computing Compact and Effective Local Feature DescriptorsHao Dong et.al.2209.13586:mortar_board:Code
2022-09-26Performance Evaluation of 3D Keypoint Detectors and Descriptors on Coloured Point Clouds in Subsea EnvironmentsKyungmin Jung et.al.2209.12881:mortar_board:None
2022-09-21Long-Lived Accurate Keypoints in Event StreamsPhilippe Chiberre et.al.2209.10385:mortar_board:None
2022-09-19Integrative Feature and Cost Aggregation with Transformers for Dense CorrespondenceSunghwan Hong et.al.2209.08742:mortar_board:None
2022-09-15Online Marker-free Extrinsic Camera Calibration using Person Keypoint DetectionsBastian Pätzold et.al.2209.07393:mortar_board:Code
2022-09-07Deep Learning-Based Automatic Diagnosis System for Developmental Dysplasia of the HipYang Li et.al.2209.03440:mortar_board:None
2022-08-27Learning to SLAM on the Fly in Unknown Environments: A Continual Learning Approach for Drones in Visually Ambiguous ScenesAli Safa et.al.2208.12997:mortar_board:None
2022-08-24Self-Supervised Endoscopic Image Key-Points MatchingManel Farhat et.al.2208.11424:mortar_board:Code
2022-08-17Blind-Spot Collision Detection System for Commercial Vehicles Using Multi Deep CNN ArchitectureMuhammad Muzammel et.al.2208.08224:mortar_board:None
2022-08-08MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse SynthesisMaximilian Gilles et.al.2208.03963:mortar_board:None
2022-08-07CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera LocalizationYujiao Shi et.al.2208.03660:mortar_board:None
2022-07-29Explicit Occlusion Reasoning for Multi-person 3D Human Pose EstimationQihao Liu et.al.2208.00090:mortar_board:None
2022-07-25Translating a Visual LEGO Manual to a Machine-Executable PlanRuocheng Wang et.al.2207.12572:mortar_board:None
2022-07-21Multi-modal Retinal Image Registration Using a Keypoint-Based Vessel Structure Aligning NetworkAline Sindel et.al.2207.10506:mortar_board:None
2022-07-15Human keypoint detection for close proximity human-robot interactionJan Docekal et.al.2207.07742:mortar_board:None
2022-07-15Adversarial Focal Loss: Asking Your Discriminator for Hard ExamplesChen Liu et.al.2207.07739:mortar_board:None
2022-07-13Rapid Person Re-Identification via Sub-space Consistency RegularizationQingze Yin et.al.2207.05933:mortar_board:None
2022-07-07RWT-SLAM: Robust Visual SLAM for Highly Weak-textured EnvironmentsQihao Peng et.al.2207.03539:mortar_board:None
2022-07-06Semi-supervised Human Pose Estimation in Art-historical ImagesMatthias Springstein et.al.2207.02976:mortar_board:None
2022-07-01Weakly-supervised High-fidelity Ultrasound Video Synthesis with Feature DecouplingJiamin Liang et.al.2207.00474:mortar_board:None
2022-06-24Motion Estimation for Large Displacements and DeformationsQiao Chen et.al.2206.12464:mortar_board:None
2022-06-24Deep embedded clustering algorithm for clustering PACS repositoriesTeo Manojlović et.al.2206.12417:mortar_board:None
2022-06-21KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D CorrespondencesXuanhan Wang et.al.2206.10090:mortar_board:Code
2022-06-20Self-Supervised Consistent Quantization for Fully Unsupervised Image RetrievalGuile Wu et.al.2206.09806:mortar_board:None
2022-06-15A Unified Sequence Interface for Vision TasksTing Chen et.al.2206.07669:mortar_board:None
2022-06-09Beyond RGB: Scene-Property Synthesis with Neural Radiance FieldsMingtong Zhang et.al.2206.04669:mortar_board:None
2022-06-03SNAKE: Shape-aware Neural 3D Keypoint FieldChengliang Zhong et.al.2206.01724:mortar_board:Code
2022-05-17MulT: An End-to-End Multitask Learning TransformerDeblina Bhattacharjee et.al.2205.08303:mortar_board:None
2022-05-10ConfLab: A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-WildChirag Raman et.al.2205.05177:mortar_board:None
2022-04-28Polarimetric imaging for the detection of synthetic models of SARS-CoV-2: a proof of conceptEmilio Gomez-Gonzalez et.al.2204.14050:mortar_board:None
2022-04-28GRIT: General Robust Image Task BenchmarkTanmay Gupta et.al.2204.13653:mortar_board:Code
2022-04-26ViTPose: Simple Vision Transformer Baselines for Human Pose EstimationYufei Xu et.al.2204.12484:mortar_board:Code
2022-04-26Unified GCNs: Towards Connecting GCNs with CNNsZiyan Zhang et.al.2204.12300:mortar_board:None
2022-04-19Self-Supervised Equivariant Learning for Oriented Keypoint DetectionJongmin Lee et.al.2204.08613:mortar_board:Code

3D Object Detection

Publish Date Title Authors arxiv PDF Code
2023-10-21Concept-based Anomaly Detection in Retail Stores for Automatic Correction using Mobile RobotsAditya Kapoor et.al.2310.14063:mortar_board:None
2023-10-21Ophthalmic Biomarker Detection Using Ensembled Vision Transformers – Winning Solution to IEEE SPS VIP Cup 2023H. A. Z. Sameen Shahgir et.al.2310.14005:mortar_board:None
2023-10-21Exploring Driving Behavior for Autonomous Vehicles Based on Gramian Angular Field Vision TransformerJunwei You et.al.2310.13906:mortar_board:None
2023-10-19A Car Model Identification System for Streamlining the Automobile Sales ProcessSaid Togru et.al.2310.13198:mortar_board:None
2023-10-19LeTFuser: Light-weight End-to-end Transformer-Based Sensor Fusion for Autonomous Driving with Multi-Task LearningPedram Agand et.al.2310.13135:mortar_board:Code
2023-10-18Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool AlgorithmS. M. Fazle Rabby Labib et.al.2310.13019:mortar_board:None
2023-10-19Predicting Ovarian Cancer Treatment Response in Histopathology using Hierarchical Vision Transformers and Multiple Instance LearningJack Breen et.al.2310.12866:mortar_board:Code
2023-10-19Model Merging by Uncertainty-Based Gradient MatchingNico Daheim et.al.2310.12808:mortar_board:None
2023-10-19Mixing Histopathology Prototypes into Robust Slide-Level Representations for Cancer SubtypingJoshua Butke et.al.2310.12769:mortar_board:Code
2023-10-19Minimalist and High-Performance Semantic Segmentation with Plain Vision TransformersYuanduo Hong et.al.2310.12755:mortar_board:Code
2023-10-19Heart Disease Detection using Vision-Based Transformer Models from ECG ImagesZeynep Hilal Kilimci et.al.2310.12630:mortar_board:None
2023-10-19Cross-attention Spatio-temporal Context Transformer for Semantic Segmentation of Historical MapsSidi Wu et.al.2310.12616:mortar_board:None
2023-10-16Interpreting and Controlling Vision Foundation Models via Text ExplanationsHaozhe Chen et.al.2310.10591:mortar_board:Code
2023-10-15Top-K Pooling with Patch Contrastive Learning for Weakly-Supervised Semantic SegmentationWangyu Wu et.al.2310.09828:mortar_board:None
2023-10-15MoEmo Vision Transformer: Integrating Cross-Attention and Movement Vectors in 3D Pose Estimation for HRI Emotion DetectionDavid C. Jeong et.al.2310.09757:mortar_board:None
2023-10-13Tackling Heterogeneity in Medical Federated learning via Vision TransformersErfan Darzi et.al.2310.09444:mortar_board:None
2023-10-13PaLI-3 Vision Language Models: Smaller, Faster, StrongerXi Chen et.al.2310.09199:mortar_board:None
2023-10-13Faster 3D cardiac CT segmentation with Vision TransformersLee Jollans et.al.2310.09099:mortar_board:Code
2023-10-12LEMON: Lossless model expansionYite Wang et.al.2310.07999:mortar_board:None
2023-10-113D TransUNet: Advancing Medical Image Segmentation through Vision TransformersJieneng Chen et.al.2310.07781:mortar_board:Code
2023-10-11Accelerating Vision Transformers Based on Heterogeneous Attention PatternsDeli Yu et.al.2310.07664:mortar_board:None
2023-10-11ProtoHPE: Prototype-guided High-frequency Patch Enhancement for Visible-Infrared Person Re-identificationGuiwei Zhang et.al.2310.07552:mortar_board:None
2023-10-11ViT-A*: Legged Robot Path Planning using Vision Transformer A*Jianwei Liu et.al.2310.07525:mortar_board:None
2023-10-11PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image ReconstructionWeijie Gan et.al.2310.07504:mortar_board:None
2023-10-11Distilling Efficient Vision Transformers from CNNs for Semantic SegmentationXu Zheng et.al.2310.07265:mortar_board:None
2023-10-10EViT: An Eagle Vision Transformer with Bi-Fovea Self-AttentionYulong Shi et.al.2310.06629:mortar_board:None
2023-10-10Machine Eye for Defects: Machine Learning-Based Solution to Identify and Characterize Topological Defects in Textured Images of Nematic MaterialsHaijie Ren et.al.2310.06406:mortar_board:None
2023-10-10Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion ModelingHuangjie Zheng et.al.2310.06389:mortar_board:None
2023-10-10Efficient Adaptation of Large Vision Transformer via Adapter Re-ComposingWei Dong et.al.2310.06234:mortar_board:Code
2023-10-09DiPS: Discriminative Pseudo-Label Sampling with Self-Supervised Transformers for Weakly Supervised Object LocalizationShakeeb Murtaza et.al.2310.06196:mortar_board:Code
2023-10-09SimPLR: A Simple and Plain Transformer for Object Detection and SegmentationDuy-Kien Nguyen et.al.2310.05920:mortar_board:None
2023-10-09Transformer Fusion with Optimal TransportMoritz Imfeld et.al.2310.05719:mortar_board:None
2023-10-09ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different DomainMd Sohag Mia et.al.2310.05664:mortar_board:None
2023-10-09No Token Left Behind: Efficient Vision Transformer via Dynamic Token IdlingXuwei Xu et.al.2310.05654:mortar_board:None
2023-10-09Plug n’ Play: Channel Shuffle Module for Enhancing Tiny Vision TransformersXuwei Xu et.al.2310.05642:mortar_board:None
2023-10-09A Simple and Robust Framework for Cross-Modality Medical Image Segmentation applied to Vision TransformersMatteo Bastico et.al.2310.05572:mortar_board:Code
2023-10-09RetSeg: Retention-based Colorectal Polyps Segmentation NetworkKhaled ELKarazle et.al.2310.05446:mortar_board:None
2023-10-09Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision TransformersShiyue Cao et.al.2310.05400:mortar_board:None
2023-10-09Hierarchical Side-Tuning for Vision TransformersWeifeng Lin et.al.2310.05393:mortar_board:None
2023-10-08Low-Resolution Self-Attention for Semantic SegmentationYu-Huan Wu et.al.2310.05026:mortar_board:None
2023-10-06FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated LearningPeiran Xu et.al.2310.04412:mortar_board:Code
| 2023-10-06 | TiC: Exploring Vision Transformer in Convolution | Song Zhang et.al. | 2310.04134 | :mortar_board: | Code |
| 2023-10-06 | Sub-token ViT Embedding via Stochastic Resonance Transformers | Dong Lao et.al. | 2310.03967 | :mortar_board: | None |
| 2023-10-05 | ALBERTA: ALgorithm-Based Error Resilience in Transformer Architectures | Haoxuan Liu et.al. | 2310.03841 | :mortar_board: | None |
| 2023-10-05 | Exploring DINO: Emergent Properties and Limitations for Synthetic Aperture Radar Imagery | Joseph A. Gallego-Mejia et.al. | 2310.03513 | :mortar_board: | None |
| 2023-10-05 | Swin-Tempo: Temporal-Aware Lung Nodule Detection in CT Scans as Video Sequences Using Swin Transformer-Enhanced UNet | Hossein Jafari et.al. | 2310.03365 | :mortar_board: | None |
| 2023-10-04 | Neural architecture impact on identifying temporally extended Reinforcement Learning tasks | Victor Vadakechirayath George et.al. | 2310.03161 | :mortar_board: | None |
| 2023-10-04 | Reinforcement Learning-based Mixture of Vision Transformers for Video Violence Recognition | Hamid Mohammadi et.al. | 2310.03108 | :mortar_board: | None |
| 2023-10-04 | Land-cover change detection using paired OpenStreetMap data and optical high-resolution imagery via object-guided Transformer | Hongruixuan Chen et.al. | 2310.02674 | :mortar_board: | Code |
| 2023-10-04 | GET: Group Event Transformer for Event-Based Vision | Yansong Peng et.al. | 2310.02642 | :mortar_board: | Code |
| 2023-10-04 | ViT-ReciproCAM: Gradient and Attention-Free Visual Explanations for Vision Transformer | Seok-Yong Byun et.al. | 2310.02588 | :mortar_board: | None |
| 2023-10-04 | Improving Drumming Robot Via Attention Transformer Network | Yang Yi et.al. | 2310.02565 | :mortar_board: | None |
| 2023-10-04 | SlowFormer: Universal Adversarial Patch for Attack on Compute and Energy Efficiency of Inference Efficient Vision Transformers | KL Navaneet et.al. | 2310.02544 | :mortar_board: | Code |
| 2023-10-03 | Selective Feature Adapter for Dense Vision Transformers | Xueqing Deng et.al. | 2310.01843 | :mortar_board: | None |
| 2023-10-03 | PPT: Token Pruning and Pooling for Efficient Vision Transformers | Xinjian Wu et.al. | 2310.01812 | :mortar_board: | None |
| 2023-10-02 | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | Size Wu et.al. | 2310.01403 | :mortar_board: | Code |
| 2023-10-02 | Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis | Jue Jiang et.al. | 2310.01209 | :mortar_board: | None |
| 2023-10-01 | RegBN: Batch Normalization of Multimodal Data with Regularization | Morteza Ghahremani et.al. | 2310.00641 | :mortar_board: | Code |
| 2023-10-01 | Win-Win: Training High-Resolution Vision Transformers from Two Windows | Vincent Leroy et.al. | 2310.00632 | :mortar_board: | None |
| 2023-09-30 | MVC: A Multi-Task Vision Transformer Network for COVID-19 Diagnosis from Chest X-ray Images | Huyen Tran et.al. | 2310.00418 | :mortar_board: | None |
| 2023-09-30 | Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression | Gousia Habib et.al. | 2310.00369 | :mortar_board: | None |
| 2023-09-30 | Dual-Augmented Transformer Network for Weakly Supervised Semantic Segmentation | Jingliang Deng et.al. | 2310.00307 | :mortar_board: | None |
| 2023-09-29 | SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation | Zhongang Cai et.al. | 2309.17448 | :mortar_board: | None |
| 2023-09-28 | FLIP: Cross-domain Face Anti-spoofing with Language Guidance | Koushik Srivatsan et.al. | 2309.16649 | :mortar_board: | Code |
| 2023-09-28 | Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit | Blake Bordelon et.al. | 2309.16620 | :mortar_board: | None |
| 2023-09-28 | Vision Transformers Need Registers | Timothée Darcet et.al. | 2309.16588 | :mortar_board: | None |
| 2023-09-28 | HTC-DC Net: Monocular Height Estimation from Single Remote Sensing Images | Sining Chen et.al. | 2309.16486 | :mortar_board: | Code |
| 2023-09-28 | Channel Vision Transformers: An Image Is Worth C x 16 x 16 Words | Yujia Bao et.al. | 2309.16108 | :mortar_board: | None |
| 2023-09-26 | GasMono: Geometry-Aided Self-Supervised Monocular Depth Estimation for Indoor Scenes | Chaoqiang Zhao et.al. | 2309.16019 | :mortar_board: | None |
| 2023-09-27 | CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs | Ao Wang et.al. | 2309.15755 | :mortar_board: | None |
| 2023-09-27 | Improving Facade Parsing with Vision Transformers and Line Integration | Bowen Wang et.al. | 2309.15523 | :mortar_board: | Code |
| 2023-09-26 | Efficient Low-rank Backpropagation for Vision Transformer Adaptation | Yuedong Yang et.al. | 2309.15275 | :mortar_board: | None |
| 2023-09-25 | Assessment of IBM and NASA’s geospatial foundation model in flood inundation mapping | Wenwen Li et.al. | 2309.14500 | :mortar_board: | None |
| 2023-09-25 | Masked Image Residual Learning for Scaling Deeper Vision Transformers | Guoxi Huang et.al. | 2309.14136 | :mortar_board: | None |
| 2023-09-25 | PARTICLE: Part Discovery and Contrastive Learning for Fine-grained Recognition | Oindrila Saha et.al. | 2309.13822 | :mortar_board: | Code |
| 2023-09-24 | MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP | Prajwal Ganugula et.al. | 2309.13716 | :mortar_board: | None |
| 2023-09-24 | Multi-Dimensional Hyena for Spatial Inductive Bias | Itamar Zimerman et.al. | 2309.13600 | :mortar_board: | None |
| 2023-09-24 | Global-correlated 3D-decoupling Transformer for Clothed Avatar Reconstruction | Zechuan Zhang et.al. | 2309.13524 | :mortar_board: | Code |
| 2023-09-23 | Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers | Adam Pardyl et.al. | 2309.13353 | :mortar_board: | None |
| 2023-09-23 | RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias | Hao Cheng et.al. | 2309.13245 | :mortar_board: | None |
| 2023-09-22 | BayesDLL: Bayesian Deep Learning Library | Minyoung Kim et.al. | 2309.12928 | :mortar_board: | Code |
| 2023-09-22 | Associative Transformer Is A Sparse Representation Learner | Yuwei Sun et.al. | 2309.12862 | :mortar_board: | None |
| 2023-09-22 | Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where | Zhi-Yi Chin et.al. | 2309.12757 | :mortar_board: | None |
| 2023-09-22 | Vision Transformers for Computer Go | Amani Sagri et.al. | 2309.12675 | :mortar_board: | None |
| 2023-09-21 | DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion | Zhenzhen Chu et.al. | 2309.12424 | :mortar_board: | None |
| 2023-09-21 | Adaptive Input-image Normalization for Solving Mode Collapse Problem in GAN-based X-ray Images | Muhammad Muneeb Saad et.al. | 2309.12245 | :mortar_board: | None |
| 2023-09-21 | Bayesian sparsification for deep neural networks with Bayesian model reduction | Dimitrije Marković et.al. | 2309.12095 | :mortar_board: | None |
| 2023-09-21 | ZS6D: Zero-shot 6D Object Pose Estimation using Vision Transformers | Philipp Ausserlechner et.al. | 2309.11986 | :mortar_board: | None |
| 2023-09-20 | RMT: Retentive Networks Meet Vision Transformers | Qihang Fan et.al. | 2309.11523 | :mortar_board: | None |
| 2023-09-20 | Forgery-aware Adaptive Vision Transformer for Face Forgery Detection | Anwei Luo et.al. | 2309.11092 | :mortar_board: | None |
| 2023-09-19 | Interpret Vision Transformers as ConvNets with Dynamic Convolutions | Chong Zhou et.al. | 2309.10713 | :mortar_board: | None |
| 2023-09-19 | Latent Space Energy-based Model for Fine-grained Open Set Recognition | Wentao Bao et.al. | 2309.10711 | :mortar_board: | None |
| 2023-09-19 | Self-Supervised Super-Resolution Approach for Isotropic Reconstruction of 3D Electron Microscopy Images from Anisotropic Acquisition | Mohammad Khateri et.al. | 2309.10646 | :mortar_board: | None |
| 2023-09-19 | Exploring the Influence of Information Entropy Change in Learning Systems | Xiaowei Yu et.al. | 2309.10625 | :mortar_board: | None |
| 2023-09-19 | LineMarkNet: Line Landmark Detection for Valet Parking | Zizhang Wu et.al. | 2309.10475 | :mortar_board: | None |
| 2023-09-18 | TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey | Zhuoyang Chen et.al. | 2309.09937 | :mortar_board: | Code |
| 2023-09-18 | Selective Volume Mixup for Video Action Recognition | Yi Tan et.al. | 2309.09534 | :mortar_board: | None |
| 2023-09-17 | MVP: Meta Visual Prompt Tuning for Few-Shot Remote Sensing Image Scene Classification | Junjie Zhu et.al. | 2309.09276 | :mortar_board: | None |
| 2023-09-17 | Image-level supervision and self-training for transformer-based cross-modality tumor segmentation | Malo de Boisredon et.al. | 2309.09246 | :mortar_board: | None |
| 2023-09-16 | MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer | Fudong Lin et.al. | 2309.09067 | :mortar_board: | Code |
| 2023-09-16 | RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework | Yuelei Wang et.al. | 2309.09003 | :mortar_board: | None |
| 2023-09-15 | Biased Attention: Do Vision Transformers Amplify Gender Bias More than Convolutional Neural Networks? | Abhishek Mandal et.al. | 2309.08760 | :mortar_board: | Code |
| 2023-09-15 | Replacing softmax with ReLU in Vision Transformers | Mitchell Wortsman et.al. | 2309.08586 | :mortar_board: | None |
| 2023-09-15 | SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels | Henry Hengyuan Zhao et.al. | 2309.08513 | :mortar_board: | Code |
| 2023-09-15 | Cross-Modal Synthesis of Structural MRI and Functional Connectivity Networks via Conditional ViT-GANs | Yuda Bi et.al. | 2309.08160 | :mortar_board: | None |
| 2023-09-15 | AnyOKP: One-Shot and Instance-Aware Object Keypoint Extraction with Pretrained ViT | Fangbo Qin et.al. | 2309.08134 | :mortar_board: | None |
| 2023-09-14 | Interpretability-Aware Vision Transformer | Yao Qiang et.al. | 2309.08035 | :mortar_board: | Code |
| 2023-09-13 | Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit? | Bill Psomas et.al. | 2309.06891 | :mortar_board: | Code |
| 2023-09-12 | Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks | Jindong Gu et.al. | 2309.06438 | :mortar_board: | None |
| 2023-09-12 | Jersey Number Recognition using Keyframe Identification from Low-Resolution Broadcast Videos | Bavesh Balaji et.al. | 2309.06285 | :mortar_board: | None |
| 2023-09-12 | A 3M-Hybrid Model for the Restoration of Unique Giant Murals: A Case Study on the Murals of Yongle Palace | Jing Yang et.al. | 2309.06194 | :mortar_board: | None |
| 2023-09-12 | Feature Aggregation Network for Building Extraction from High-resolution Remote Sensing Images | Xuan Zhou et.al. | 2309.06017 | :mortar_board: | None |
| 2023-09-11 | Mobile Vision Transformer-based Visual Object Tracking | Goutam Yelluru Gopal et.al. | 2309.05829 | :mortar_board: | Code |
| 2023-09-11 | Divergences in Color Perception between Deep Neural Networks and Humans | Ethan O. Nadler et.al. | 2309.05809 | :mortar_board: | None |
| 2023-09-11 | CNN or ViT? Revisiting Vision Transformers Through the Lens of Convolution | Chenghao Li et.al. | 2309.05375 | :mortar_board: | None |
| 2023-09-10 | DeViT: Decomposing Vision Transformers for Collaborative Inference in Edge Devices | Guanyu Xu et.al. | 2309.05015 | :mortar_board: | None |
| 2023-09-09 | How to Evaluate Semantic Communications for Images with ViTScore Metric? | Tingting Zhu et.al. | 2309.04891 | :mortar_board: | None |
| 2023-09-09 | Unified Language-Vision Pretraining with Dynamic Discrete Visual Tokenization | Yang Jin et.al. | 2309.04669 | :mortar_board: | None |
| 2023-09-09 | Video and Synthetic MRI Pre-training of 3D Vision Architectures for Neuroimage Analysis | Nikhil J. Dhinagar et.al. | 2309.04651 | :mortar_board: | None |
| 2023-09-08 | Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts | Erik Daxberger et.al. | 2309.04354 | :mortar_board: | None |
| 2023-09-07 | S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens | Rizhao Cai et.al. | 2309.04038 | :mortar_board: | None |
| 2023-09-07 | DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions | Haochen Wang et.al. | 2309.03576 | :mortar_board: | None |
| 2023-09-06 | Combining pre-trained Vision Transformers and CIDER for Out Of Domain Detection | Grégor Jouet et.al. | 2309.03047 | :mortar_board: | None |
| 2023-09-06 | Improving diagnosis and prognosis of lung cancer using vision transformers: A scoping review | Hazrat Ali et.al. | 2309.02783 | :mortar_board: | None |
| 2023-09-05 | Compressing Vision Transformers for Low-Resource Visual Learning | Eric Youn et.al. | 2309.02617 | :mortar_board: | None |
| 2023-09-05 | Domain Adaptation for Efficiently Fine-tuning Vision Transformer with Encrypted Images | Teru Nagamori et.al. | 2309.02556 | :mortar_board: | None |
| 2023-09-05 | A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking | Lorenzo Papa et.al. | 2309.02031 | :mortar_board: | None |
| 2023-09-04 | Locality-Aware Hyperspectral Classification | Fangqin Zhou et.al. | 2309.01561 | :mortar_board: | Code |
| 2023-09-04 | Large Separable Kernel Attention: Rethinking the Large Kernel Attention Design in CNN | Kin Wai Lau et.al. | 2309.01439 | :mortar_board: | None |
| 2023-09-04 | DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | Zhuofan Xia et.al. | 2309.01430 | :mortar_board: | Code |
| 2023-09-04 | Leveraging Self-Supervised Vision Transformers for Neural Transfer Function Design | Dominik Engel et.al. | 2309.01408 | :mortar_board: | None |
| 2023-09-04 | Semantic-Constraint Matching Transformer for Weakly Supervised Object Localization | Yiwen Cao et.al. | 2309.01331 | :mortar_board: | None |
| 2023-09-04 | ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer | Gyeongdong Yang et.al. | 2309.01310 | :mortar_board: | None |
| 2023-09-02 | Contrastive Feature Masking Open-Vocabulary Vision Transformer | Dahun Kim et.al. | 2309.00775 | :mortar_board: | None |
| 2023-09-01 | Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels | Haotian Wu et.al. | 2309.00470 | :mortar_board: | None |
| 2023-09-01 | Interpretable Medical Imagery Diagnosis with Self-Attentive Transformers: A Review of Explainable AI for Health Care | Tin Lai et.al. | 2309.00252 | :mortar_board: | None |
| 2023-08-31 | Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation | Reza Azad et.al. | 2309.00121 | :mortar_board: | None |
| 2023-08-31 | Laplacian-Former: Overcoming the Limitations of Vision Transformers in Local Texture Detection | Reza Azad et.al. | 2309.00108 | :mortar_board: | None |
| 2023-08-31 | Towards Optimal Patch Size in Vision Transformers for Tumor Segmentation | Ramtin Mojtahedi et.al. | 2308.16598 | :mortar_board: | Code |
| 2023-08-30 | Learning Diverse Features in Vision Transformers for Improved Generalization | Armand Mihai Nicolicioiu et.al. | 2308.16274 | :mortar_board: | Code |
| 2023-08-30 | Emergence of Segmentation with Minimalistic White-Box Transformers | Yaodong Yu et.al. | 2308.16271 | :mortar_board: | Code |
| 2023-08-29 | Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation | Fu-En Yang et.al. | 2308.15367 | :mortar_board: | None |
| 2023-08-29 | Imperceptible Adversarial Attack on Deep Neural Networks from Image Boundary | Fahad Alrasheedi et.al. | 2308.15344 | :mortar_board: | None |
| 2023-08-29 | TKwinFormer: Top k Window Attention in Vision Transformers for Feature Matching | Yun Liao et.al. | 2308.15144 | :mortar_board: | None |
| 2023-08-28 | PanoSwin: a Pano-style Swin Transformer for Panorama Understanding | Zhixin Ling et.al. | 2308.14726 | :mortar_board: | None |
| 2023-08-28 | Fast Feedforward Networks | Peter Belcak et.al. | 2308.14711 | :mortar_board: | Code |
| 2023-08-28 | FIRE: Food Image to REcipe generation | Prateek Chhikara et.al. | 2308.14391 | :mortar_board: | None |
| 2023-08-28 | GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition | Ruijie Yao et.al. | 2308.14378 | :mortar_board: | None |
| 2023-08-27 | A comprehensive review on Plant Leaf Disease detection using Deep learning | Sumaya Mustofa et.al. | 2308.14087 | :mortar_board: | None |
| 2023-08-27 | Domain-Specificity Inducing Transformers for Source-Free Domain Adaptation | Sunandini Sanyal et.al. | 2308.14023 | :mortar_board: | None |
| 2023-08-26 | Fixating on Attention: Integrating Human Eye Tracking into Vision Transformers | Sharath Koorathota et.al. | 2308.13969 | :mortar_board: | None |
| 2023-08-26 | Unified Single-Stage Transformer Network for Efficient RGB-T Tracking | Jianqiang Xia et.al. | 2308.13764 | :mortar_board: | None |
| 2023-08-25 | ACC-UNet: A Completely Convolutional UNet model for the 2020s | Nabil Ibtehaz et.al. | 2308.13680 | :mortar_board: | Code |
| 2023-08-25 | Enhancing Landmark Detection in Cluttered Real-World Scenarios with Vision Transformers | Mohammad Javad Rajabi et.al. | 2308.13671 | :mortar_board: | None |
| 2023-08-25 | Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers | Matthew Dutson et.al. | 2308.13494 | :mortar_board: | Code |
| 2023-08-25 | An investigation into the impact of deep learning model choice on sex and race bias in cardiac MR segmentation | Tiarna Lee et.al. | 2308.13415 | :mortar_board: | None |
| 2023-08-25 | CS-Mixer: A Cross-Scale Vision MLP Model with Spatial-Channel Mixing | Jonathan Cui et.al. | 2308.13363 | :mortar_board: | None |
| 2023-08-25 | A Re-Parameterized Vision Transformer (ReVT) for Domain-Generalized Semantic Segmentation | Jan-Aike Termöhlen et.al. | 2308.13331 | :mortar_board: | None |
| 2023-08-24 | Full-dose PET Synthesis from Low-dose PET Using High-efficiency Diffusion Denoising Probabilistic Model | Shaoyan Pan et.al. | 2308.13072 | :mortar_board: | None |
| 2023-08-24 | Spherical Vision Transformer for 360-degree Video Saliency Prediction | Mert Cokelek et.al. | 2308.13004 | :mortar_board: | None |
| 2023-08-24 | Towards Hierarchical Regional Transformer-based Multiple Instance Learning | Josef Cersovsky et.al. | 2308.12634 | :mortar_board: | None |
| 2023-08-24 | SieveNet: Selecting Point-Based Features for Mesh Networks | Shengchao Yuan et.al. | 2308.12530 | :mortar_board: | None |
| 2023-08-23 | MOFO: MOtion FOcused Self-Supervision for Video Understanding | Mona Ahmadian et.al. | 2308.12447 | :mortar_board: | None |
| 2023-08-23 | BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection | Tinghao Xie et.al. | 2308.12439 | :mortar_board: | None |
| 2023-08-23 | Vision Transformer Adapters for Generalizable Multitask Learning | Deblina Bhattacharjee et.al. | 2308.12372 | :mortar_board: | None |
| 2023-08-23 | SPPNet: A Single-Point Prompt Network for Nuclei Image Segmentation | Qing Xu et.al. | 2308.12231 | :mortar_board: | Code |
| 2023-08-23 | SG-Former: Self-guided Transformer with Evolving Token Reallocation | Sucheng Ren et.al. | 2308.12216 | :mortar_board: | None |
| 2023-08-23 | Masking Strategies for Background Bias Removal in Computer Vision Models | Ananthu Aniraj et.al. | 2308.12127 | :mortar_board: | Code |
| 2023-08-23 | Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment | Kangmin Xu et.al. | 2308.12001 | :mortar_board: | None |
| 2023-08-22 | SwinFace: A Multi-task Transformer for Face Recognition, Expression Recognition, Age Estimation and Attribute Estimation | Lixiong Qin et.al. | 2308.11509 | :mortar_board: | Code |
| 2023-08-22 | Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding | Jiantao Wu et.al. | 2308.11448 | :mortar_board: | None |
| 2023-08-22 | SAIPy: A Python Package for single station Earthquake Monitoring using Deep Learning | Wei Li et.al. | 2308.11428 | :mortar_board: | None |
| 2023-08-22 | TurboViT: Generating Fast Vision Transformers via Generative Architecture Search | Alexander Wong et.al. | 2308.11421 | :mortar_board: | None |
| 2023-08-22 | Exemplar-Free Continual Transformer with Convolutions | Anurag Roy et.al. | 2308.11357 | :mortar_board: | None |
| 2023-08-21 | Vision Transformer Pruning Via Matrix Decomposition | Tianyi Sun et.al. | 2308.10839 | :mortar_board: | None |
| 2023-08-21 | Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers | Natalia Frumkin et.al. | 2308.10814 | :mortar_board: | None |
| 2023-08-21 | Patch Is Not All You Need | Changzhen Li et.al. | 2308.10729 | :mortar_board: | None |
| 2023-08-21 | Spatial Transform Decoupling for Oriented Object Detection | Hongtian Yu et.al. | 2308.10561 | :mortar_board: | Code |
| 2023-08-21 | Joint learning of images and videos with a single Vision Transformer | Shuki Shimizu et.al. | 2308.10533 | :mortar_board: | None |
| 2023-08-20 | Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting | Qidong Huang et.al. | 2308.10315 | :mortar_board: | Code |
| 2023-08-20 | FedSIS: Federated Split Learning with Intermediate Representation Sampling for Privacy-preserving Generalized Face Presentation Attack Detection | Naif Alkhunaizi et.al. | 2308.10236 | :mortar_board: | Code |
| 2023-08-20 | TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective | Jun Dan et.al. | 2308.10133 | :mortar_board: | Code |
| 2023-08-19 | Towards a High-Performance Object Detector: Insights from Drone Detection Using ViT and CNN-based Deep Learning Models | Junyang Zhang et.al. | 2308.09899 | :mortar_board: | None |
| 2023-08-18 | On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers | Thomas De Min et.al. | 2308.09610 | :mortar_board: | Code |
| 2023-08-18 | Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers | Tobias Christian Nauen et.al. | 2308.09372 | :mortar_board: | Code |
| 2023-08-17 | FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning | Guangyu Sun et.al. | 2308.09160 | :mortar_board: | Code |
| 2023-08-17 | SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning | Hao Feng et.al. | 2308.09040 | :mortar_board: | None |
| 2023-08-16 | SkinDistilViT: Lightweight Vision Transformer for Skin Lesion Classification | Vlad-Constantin Lungu-Stan et.al. | 2308.08669 | :mortar_board: | None |
| 2023-08-15 | SEDA: Self-Ensembling ViT with Defensive Distillation and Adversarial Training for robust Chest X-rays Classification | Raza Imam et.al. | 2308.07874 | :mortar_board: | Code |
| 2023-08-15 | Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening | Jack Foster et.al. | 2308.07707 | :mortar_board: | Code |
| 2023-08-15 | Enhancing Network Initialization for Medical AI Models Using Large-Scale, Unlabeled Natural Images | Soroosh Tayebi Arasteh et.al. | 2308.07688 | :mortar_board: | None |
| 2023-08-15 | Block-Wise Encryption for Reliable Vision Transformer models | Hitoshi Kiya et.al. | 2308.07612 | :mortar_board: | None |
| 2023-08-14 | Wide-Area Geolocalization with a Limited Field of View Camera in Challenging Urban Environments | Lena M. Downes et.al. | 2308.07432 | :mortar_board: | None |
| 2023-08-14 | A Unified Masked Autoencoder with Patchified Skeletons for Motion Synthesis | Esteve Valls Mascaro et.al. | 2308.07301 | :mortar_board: | None |
| 2023-08-14 | A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer | Manjary P Gangan et.al. | 2308.07279 | :mortar_board: | Code |
| 2023-08-14 | Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation | Liam Chalcroft et.al. | 2308.07251 | :mortar_board: | Code |
| 2023-08-14 | SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers | Xijun Wang et.al. | 2308.07110 | :mortar_board: | None |
| 2023-08-14 | Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking | Ben Kang et.al. | 2308.06904 | :mortar_board: | Code |
| 2023-08-13 | Modified Topological Image Preprocessing for Skin Lesion Classifications | Hong Cheng et.al. | 2308.06796 | :mortar_board: | None |
| 2023-08-12 | Revisiting Vision Transformer from the View of Path Ensemble | Shuning Chang et.al. | 2308.06548 | :mortar_board: | None |
| 2023-08-12 | Performance Analysis for Resource Constrained Decentralized Federated Learning Over Wireless Networks | Zhigang Yan et.al. | 2308.06496 | :mortar_board: | None |
| 2023-08-11 | Experts Weights Averaging: A New General Training Scheme for Vision Transformers | Yongqi Huang et.al. | 2308.06093 | :mortar_board: | None |
| 2023-08-10 | Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention | Liang Shang et.al. | 2308.05872 | :mortar_board: | None |
| 2023-08-10 | Temporally-Adaptive Models for Efficient Video Understanding | Ziyuan Huang et.al. | 2308.05787 | :mortar_board: | Code |
| 2023-08-10 | Surface Masked AutoEncoder: Self-Supervision for Cortical Imaging Data | Simon Dahan et.al. | 2308.05474 | :mortar_board: | Code |
| 2023-08-09 | Spatial Gated Multi-Layer Perceptron for Land Use and Land Cover Mapping | Ali Jamali et.al. | 2308.05235 | :mortar_board: | Code |
| 2023-08-09 | A degree of image identification at sub-human scales could be possible with more advanced clusters | Prateek Y J et.al. | 2308.05092 | :mortar_board: | Code |
| 2023-08-09 | Which Tokens to Use? Investigating Token Reduction in Vision Transformers | Joakim Bruslund Haurum et.al. | 2308.04657 | :mortar_board: | None |
| 2023-08-08 | Unsupervised Camouflaged Object Segmentation as Domain Adaptation | Yi Zhang et.al. | 2308.04528 | :mortar_board: | Code |
| 2023-08-08 | All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation | Weixuan Sun et.al. | 2308.04321 | :mortar_board: | None |
| 2023-08-08 | Class-level Structural Relation Modelling and Smoothing for Visual Representation Learning | Zitan Chen et.al. | 2308.04142 | :mortar_board: | Code |
| 2023-08-07 | Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission | Bingyan Xie et.al. | 2308.03713 | :mortar_board: | None |
| 2023-08-07 | Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience | A. Emin Orhan et.al. | 2308.03712 | :mortar_board: | Code |
| 2023-08-07 | Improving FHB Screening in Wheat Breeding Using an Efficient Transformer Model | Babak Azad et.al. | 2308.03670 | :mortar_board: | None |
| 2023-08-07 | DiT: Efficient Vision Transformers with Dynamic Token Routing | Yuchen Ma et.al. | 2308.03409 | :mortar_board: | Code |
| 2023-08-07 | Part-Aware Transformer for Generalizable Person Re-identification | Hao Ni et.al. | 2308.03322 | :mortar_board: | None |
| 2023-08-07 | FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search | Jordan Dotzel et.al. | 2308.03290 | :mortar_board: | None |
| 2023-08-06 | TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment | Chaofeng Chen et.al. | 2308.03060 | :mortar_board: | Code |
| 2023-08-06 | High-Resolution Vision Transformers for Pixel-Level Identification of Structural Components and Damage | Kareem Eltouny et.al. | 2308.03006 | :mortar_board: | None |
| 2023-08-06 | MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation | Lian Xu et.al. | 2308.03005 | :mortar_board: | Code |
| 2023-08-05 | Unfolding Once is Enough: A Deployment-Friendly Transformer Unit for Super-Resolution | Yong Liu et.al. | 2308.02794 | :mortar_board: | Code |
| 2023-08-04 | M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition | Jiyong Moon et.al. | 2308.02161 | :mortar_board: | None |
| 2023-08-04 | Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network | Bryar Shareef et.al. | 2308.02101 | :mortar_board: | None |
| 2023-08-03 | A Multidimensional Analysis of Social Biases in Vision Transformers | Jannik Brinkmann et.al. | 2308.01948 | :mortar_board: | None |
| 2023-08-03 | Dynamic Token-Pass Transformers for Semantic Segmentation | Yuang Liu et.al. | 2308.01944 | :mortar_board: | None |
| 2023-08-02 | A vision transformer-based framework for knowledge transfer from multi-modal to mono-modal lymphoma subtyping models | Bilel Guetarni et.al. | 2308.01328 | :mortar_board: | None |
| 2023-08-02 | Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation | Quan Tang et.al. | 2308.01045 | :mortar_board: | None |
| 2023-08-01 | DINO-CXR: A self supervised method based on vision transformer for chest X-ray classification | Mohammadreza Shakouri et.al. | 2308.00475 | :mortar_board: | None |
| 2023-08-01 | ViT2EEG: Leveraging Hybrid Pretrained Vision Transformers for EEG Data | Ruiqi Yang et.al. | 2308.00454 | :mortar_board: | Code |
| 2023-08-01 | FLatten Transformer: Vision Transformer using Focused Linear Attention | Dongchen Han et.al. | 2308.00442 | :mortar_board: | Code |
| 2023-08-01 | Enhanced Security with Encrypted Vision Transformer in Federated Learning | Rei Aso et.al. | 2308.00271 | :mortar_board: | None |
| 2023-08-01 | Improving Pixel-based MIM by Reducing Wasted Modeling Capability | Yuan Liu et.al. | 2308.00261 | :mortar_board: | Code |
| 2023-08-01 | LGViT: Dynamic Early Exiting for Accelerating Vision Transformer | Guanyu Xu et.al. | 2308.00255 | :mortar_board: | None |
| 2023-07-31 | Performance Evaluation of Swin Vision Transformer Model using Gradient Accumulation Optimization Technique | Sanad Aburass et.al. | 2308.00197 | :mortar_board: | None |
| 2023-07-30 | StylePrompter: All Styles Need Is Attention | Chenyi Zhuang et.al. | 2307.16151 | :mortar_board: | Code |
| 2023-07-29 | HandMIM: Pose-Aware Self-Supervised Learning for 3D Hand Mesh Estimation | Zuyan Liu et.al. | 2307.16061 | :mortar_board: | None |
| 2023-07-29 | CoVid-19 Detection leveraging Vision Transformers and Explainable AI | Pangoth Santhosh Kumar et.al. | 2307.16033 | :mortar_board: | None |
| 2023-07-27 | Self-Supervised Graph Transformer for Deepfake Detection | Aminollah Khormali et.al. | 2307.15019 | :mortar_board: | None |
| 2023-07-27 | IML-ViT: Image Manipulation Localization by Vision Transformer | Xiaochen Ma et.al. | 2307.14863 | :mortar_board: | Code |
| 2023-07-27 | Pre-training Vision Transformers with Very Limited Synthesized Images | Ryo Nakamura et.al. | 2307.14710 | :mortar_board: | Code |
| 2023-07-26 | MiDaS v3.1 – A Model Zoo for Robust Monocular Relative Depth Estimation | Reiner Birkl et.al. | 2307.14460 | :mortar_board: | Code |
| 2023-07-26 | Sparse Double Descent in Vision Transformers: real or phantom threat? | Victor Quétu et.al. | 2307.14253 | :mortar_board: | Code |
| 2023-07-26 | Boon: A Neural Search Engine for Cross-Modal Information Retrieval | Yan Gong et.al. | 2307.14240 | :mortar_board: | None |
| 2023-07-26 | Adaptive Frequency Filters As Efficient Global Token Mixers | Zhipeng Huang et.al. | 2307.14008 | :mortar_board: | None |
| 2023-07-26 | Enhanced Security against Adversarial Examples Using a Random Ensemble of Encrypted Vision Transformer Models | Ryota Iijima et.al. | 2307.13985 | :mortar_board: | None |
| 2023-07-26 | Understanding Deep Neural Networks via Linear Separability of Hidden Layers | Chao Zhang et.al. | 2307.13962 | :mortar_board: | None |
| 2023-07-26 | Visual Prompt Flexible-Modal Face Anti-Spoofing | Zitong Yu et.al. | 2307.13958 | :mortar_board: | None |
| 2023-07-26 | AViT: Adapting Vision Transformers for Small Skin Lesion Segmentation Datasets | Siyi Du et.al. | 2307.13897 | :mortar_board: | Code |
| 2023-07-25 | On the unreasonable vulnerability of transformers for image restoration – and an easy fix | Shashank Agnihotri et.al. | 2307.13856 | :mortar_board: | None |
| 2023-07-25 | Optical Flow boosts Unsupervised Localization and Segmentation | Xinyu Zhang et.al. | 2307.13640 | :mortar_board: | Code |
| 2023-07-25 | Conditional Cross Attention Network for Multi-Space Embedding without Entanglement in Only a SINGLE Network | Chull Hwan Song et.al. | 2307.13254 | :mortar_board: | None |
| 2023-07-25 | Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition | Cheng Da et.al. | 2307.13244 | :mortar_board: | Code |
| 2023-07-24 | AMAE: Adaptation of Pre-Trained Masked Autoencoder for Dual-Distribution Anomaly Detection in Chest X-Rays | Behzad Bozorgtabar et.al. | 2307.12721 | :mortar_board: | None |
| 2023-07-24 | SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation | Yiqing Wang et.al. | 2307.12591 | :mortar_board: | Code |
| 2023-07-24 | A Good Student is Cooperative and Reliable: CNN-Transformer Collaborative Learning for Semantic Segmentation | Jinjing Zhu et.al. | 2307.12574 | :mortar_board: | None |
| 2023-07-24 | Robust face anti-spoofing framework with Convolutional Vision Transformer | Yunseung Lee et.al. | 2307.12459 | :mortar_board: | None |
| 2023-07-23 | Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision | Menghao Li et.al. | 2307.12392 | :mortar_board: | Code |
| 2023-07-22 | Sparse then Prune: Toward Efficient Vision Transformers | Yogi Prasetyo et.al. | 2307.11988 | :mortar_board: | Code |
| 2023-07-21 | Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition | Isack Lee et.al. | 2307.11404 | :mortar_board: | Code |
| 2023-07-20 | Towards General Game Representations: Decomposing Games Pixels into Content and Style | Chintan Trivedi et.al. | 2307.11141 | :mortar_board: | None |
| 2023-07-20 | Comparison between transformers and convolutional models for fine-grained classification of insects | Rita Pucci et.al. | 2307.11112 | :mortar_board: | None |
| 2023-07-20 | GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos | Nisarg A. Shah et.al. | 2307.11081 | :mortar_board: | Code |
| 2023-07-20 | Learned Thresholds Token Merging and Pruning for Vision Transformers | Maxim Bonnaerens et.al. | 2307.10780 | :mortar_board: | Code |
| 2023-07-20 | Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data | Sahar Almahfouz Nasser et.al. | 2307.10698 | :mortar_board: | Code |
| 2023-07-20 | Quantized Feature Distillation for Network Quantization | Ke Zhu et.al. | 2307.10638 | :mortar_board: | None |
| 2023-07-17 | Study of Vision Transformers for Covid-19 Detection from Chest X-rays | Sandeep Angara et.al. | 2307.09402 | :mortar_board: | None |
| 2023-07-18 | MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments | Spyros Gidaris et.al. | 2307.09361 | :mortar_board: | None |
| 2023-07-18 | RepViT: Revisiting Mobile CNN From ViT Perspective | Ao Wang et.al. | 2307.09283 | :mortar_board: | Code |
| 2023-07-18 | Light-Weight Vision Transformer with Parallel Local and Global Self-Attention | Nikolas Ebert et.al. | 2307.09120 | :mortar_board: | None |
| 2023-07-18 | NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF | Stefan Lionar et.al. | 2307.09112 | :mortar_board: | None |
| 2023-07-18 | R-Cut: Enhancing Explainability in Vision Transformers with Relationship Weighted Out and Cut | Yingjie Niu et.al. | 2307.09050 | :mortar_board: | None |
| 2023-07-18 | Human Action Recognition in Still Images Using ConViT | Seyed Rohollah Hosseyni et.al. | 2307.08994 | :mortar_board: | None |
| 2023-07-17 | Scale-Aware Modulation Meet Transformer | Weifeng Lin et.al. | 2307.08579 | :mortar_board: | Code |
| 2023-07-17 | BUS:Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization | Chaoya Jiang et.al. | 2307.08504 | :mortar_board: | None |
| 2023-07-17 | Cumulative Spatial Knowledge Distillation for Vision Transformers | Borui Zhao et.al. | 2307.08500 | :mortar_board: | None |
| 2023-07-17 | ShiftNAS: Improving One-shot NAS via Probability Shift | Mingyang Zhang et.al. | 2307.08300 | :mortar_board: | Code |
| 2023-07-17 | Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting | Wentao Bao et.al. | 2307.08243 | :mortar_board: | None |
| 2023-07-16 | Domain Generalisation with Bidirectional Encoder Representations from Vision Transformers | Hamza Riaz et.al. | 2307.08117 | :mortar_board: | None |
| 2023-07-16 | Dense Multitask Learning to Reconfigure Comics | Deblina Bhattacharjee et.al. | 2307.08071 | :mortar_board: | None |
| 2023-07-16 | A Survey of Techniques for Optimizing Transformer Inference | Krishna Teja Chitty-Venkata et.al. | 2307.07982 | :mortar_board: | None |
| 2023-07-16 | S2R-ViT for Multi-Agent Cooperative Perception: Bridging the Gap from Simulation to Reality | Jinlong Li et.al. | 2307.07935 | :mortar_board: | None |
| 2023-07-14 | TALL: Thumbnail Layout for Deepfake Video Detection | Yuting Xu et.al. | 2307.07494 | :mortar_board: | None |
| 2023-07-14 | Multimodal Distillation for Egocentric Action Recognition | Gorjan Radevski et.al. | 2307.07483 | :mortar_board: | None |
| 2023-07-14 | BiGSeT: Binary Mask-Guided Separation Training for DNN-based Hyperspectral Anomaly Detection | Haijun Liu et.al. | 2307.07428 | :mortar_board: | None |
| 2023-07-14 | HEAL-SWIN: A Vision Transformer On The Sphere | Oscar Carlsson et.al. | 2307.07313 | :mortar_board: | Code |
| 2023-07-14 | MaxSR: Image Super-Resolution Using Improved MaxViT | Bincheng Yang et.al. | 2307.07240 | :mortar_board: | None |
| 2023-07-13 | Deepfake Video Detection Using Generative Convolutional Vision Transformer | Deressa Wodajo et.al. | 2307.07036 | :mortar_board: | Code |
| 2023-07-12 | Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Mostafa Dehghani et.al. | 2307.06304 | :mortar_board: | None |
| 2023-07-12 | What Happens During Finetuning of Vision Transformers: An Invariance Based Investigation | Gabriele Merlin et.al. | 2307.06006 | :mortar_board: | None |
| 2023-07-11 | PIGEON: Predicting Image Geolocations | Lukas Haas et.al. | 2307.05845 | :mortar_board: | None |
| 2023-07-11 | Image Reconstruction using Enhanced Vision Transformer | Nikhil Verma et.al. | 2307.05616 | :mortar_board: | None |
| 2023-07-10 | MiVOLO: Multi-input Transformer for Age and Gender Estimation | Maksim Kuprashevich et.al. | 2307.04616 | :mortar_board: | Code |
| 2023-07-10 | Source-Free Open-Set Domain Adaptation for Histopathological Images via Distilling Self-Supervised Vision Transformer | Guillaume Vray et.al. | 2307.04596 | :mortar_board: | None |
| 2023-07-10 | One-Shot Pruning for Fast-adapting Pre-trained Models on Devices | Haiyan Zhao et.al. | 2307.04365 | :mortar_board: | None |
| 2023-07-09 | Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers | Zhiyu Zhu et.al. | 2307.04129 | :mortar_board: | None |
| 2023-07-09 | Random Position Adversarial Patch for Vision Transformers | Mingzhen Shao et.al. | 2307.04066 | :mortar_board: | None |
| 2023-07-08 | Understanding the Efficacy of U-Net & Vision Transformer for Groundwater Numerical Modelling | Maria Luisa Taccari et.al. | 2307.04010 | :mortar_board: | None |
| 2023-07-07 | INT-FP-QSim: Mixed Precision and Formats For Large Language Models and Vision Transformers | Lakshmi Nair et.al. | 2307.03712 | :mortar_board: | Code |
| 2023-07-07 | HoughLaneNet: Lane Detection with Deep Hough Transform and Dynamic Convolution | Jia-Qi Zhang et.al. | 2307.03494 | :mortar_board: | None |
| 2023-07-07 | Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation | Dahyun Kang et.al. | 2307.03407 | :mortar_board: | None |
| 2023-07-06 | Origin-Destination Travel Time Oracle for Map-based Services | Yan Lin et.al. | 2307.03048 | :mortar_board: | None |
| 2023-07-06 | Art Authentication with Vision Transformers | Ludovica Schaerf et.al. | 2307.03039 | :mortar_board: | None |
| 2023-07-05 | MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers | Jakob Drachmann Havtorn et.al. | 2307.02321 | :mortar_board: | None |
| 2023-07-05 | Interactive Image Segmentation with Cross-Modality Vision Transformers | Kun Li et.al. | 2307.02280 | :mortar_board: | Code |
| 2023-07-05 | MAE-DFER: Efficient Masked Autoencoder for Self-supervised Dynamic Facial Expression Recognition | Licai Sun et.al. | 2307.02227 | :mortar_board: | Code |
| 2023-07-05 | Harmonizing Feature Attributions Across Deep Learning Architectures: Enhancing Interpretability and Consistency | Md Abdul Kadir et.al. | 2307.02150 | :mortar_board: | None |
| 2023-07-05 | MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets | Siyi Du et.al. | 2307.02100 | :mortar_board: | Code |
| 2023-07-05 | Make A Long Image Short: Adaptive Token Length for Vision Transformers | Qiqi Zhou et.al. | 2307.02092 | :mortar_board: | None |
2023-07-04Deep Features for Contactless Fingerprint Presentation Attack Detection: Can They Be Generalized?Hailin Li et.al.2307.01845:mortar_board:None
2023-07-04In-Domain Self-Supervised Learning Can Lead to Improvements in Remote Sensing Image ClassificationIvica Dimitrovski et.al.2307.01645:mortar_board:None
2023-07-03Streamlined Lensed Quasar Identification in Multiband Images via Ensemble NetworksIrham Taufik Andika et.al.2307.01090:mortar_board:None
2023-07-02X-MLP: A Patch Embedding-Free MLP Architecture for VisionXinyue Wang et.al.2307.00592:mortar_board:None
2023-07-01WavePaint: Resource-efficient Token-mixer for Self-supervised InpaintingPranav Jeevan et.al.2307.00407:mortar_board:Code
2023-07-01MobileViG: Graph-Based Sparse Attention for Mobile Vision ApplicationsMustafa Munir et.al.2307.00395:mortar_board:Code
2023-07-01Variation-aware Vision Transformer QuantizationXijie Huang et.al.2307.00331:mortar_board:Code
2023-07-01More for Less: Compact Convolutional Transformers Enable Robust Medical Image Classification with Limited DataAndrew Kean Gao et.al.2307.00213:mortar_board:None
2023-06-30Stitched ViTs are Flexible Vision BackbonesZizheng Pan et.al.2307.00154:mortar_board:Code
2023-06-30Hardwiring ViT Patch Selectivity into CNNs using Patch MixingAriel N. Lee et.al.2306.17848:mortar_board:None
2023-06-30HVTSurv: Hierarchical Vision Transformer for Patient-Level Survival Prediction from Whole Slide ImageZhuchen Shao et.al.2306.17373:mortar_board:Code
2023-06-29An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous TrainingZitian Chen et.al.2306.17165:mortar_board:None
2023-06-29Learning Structure-Guided Diffusion Model for 2D Human Pose EstimationZhongwei Qiu et.al.2306.17074:mortar_board:None
2023-06-29Spatial Reasoning via Deep Vision Models for Robotic Sequential ManipulationHongyou Zhou et.al.2306.17053:mortar_board:None
2023-06-29BinaryViT: Pushing Binary Vision Transformers Towards Convolutional ModelsPhuoc-Hoan Charles Le et.al.2306.16678:mortar_board:Code
2023-06-27CellViT: Vision Transformers for Precise Cell Segmentation and ClassificationFabian Hörst et.al.2306.15350:mortar_board:Code
2023-06-27Novel Hybrid-Learning Algorithms for Improved Millimeter-Wave Imaging SystemsJosiah Smith et.al.2306.15341:mortar_board:None
2023-06-27Towards predicting Pedestrian Evacuation Time and Density from Floorplans using a Vision TransformerPatrick Berggold et.al.2306.15318:mortar_board:None
2023-06-26FeSViBS: Federated Split Learning of Vision Transformer with Block SamplingFaris Almalik et.al.2306.14638:mortar_board:Code
2023-06-25Adaptive Window Pruning for Efficient Local Motion DeblurringHaoying Li et.al.2306.14268:mortar_board:None
2023-06-23Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying WindowJinkyu Koo et.al.2306.13776:mortar_board:None
2023-06-23ProRes: Exploring Degradation-aware Visual Prompt for Universal Image RestorationJiaqi Ma et.al.2306.13653:mortar_board:Code
2023-06-22Quantizable Transformers: Removing Outliers by Helping Attention Heads Do NothingYelysei Bondarenko et.al.2306.12929:mortar_board:None
2023-06-21Inter-Instance Similarity Modeling for Contrastive LearningChengchao Shen et.al.2306.12243:mortar_board:Code
2023-06-21ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM PretrainingDezhi Peng et.al.2306.12106:mortar_board:Code
2023-06-19RaViTT: Random Vision Transformer TokensFelipe A. Quezada et.al.2306.10959:mortar_board:None
2023-06-19TeleViT: Teleconnection-driven Transformers Improve Subseasonal to Seasonal Wildfire ForecastingIoannis Prapas et.al.2306.10940:mortar_board:Code
2023-06-19B-cos Alignment for Inherently Interpretable CNNs and Vision TransformersMoritz Böhle et.al.2306.10898:mortar_board:None
2023-06-19Vision Transformer with Attention Map Hallucination and FFN CompactionHaiyang Xu et.al.2306.10875:mortar_board:None
2023-06-16Group Orthogonalization Regularization For Vision Models Adaptation and RobustnessYoav Kurtz et.al.2306.10001:mortar_board:Code
2023-06-16LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient LearningJifan Zhang et.al.2306.09910:mortar_board:Code
2023-06-15Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision TransformersDominick Reilly et.al.2306.09331:mortar_board:Code
2023-06-15Neural Fine-Tuning Search for Few-Shot LearningPanagiotis Eustratiadis et.al.2306.09295:mortar_board:Code
2023-06-15ViP: A Differentially Private Foundation Model for Computer VisionYaodong Yu et.al.2306.08842:mortar_board:None
2023-06-14Hippocampus Substructure Segmentation Using Morphological Vision Transformer LearningYang Lei et.al.2306.08723:mortar_board:None
2023-06-13Rethinking Polyp Segmentation from an Out-of-Distribution PerspectiveGe-Peng Ji et.al.2306.07792:mortar_board:None
2023-06-13Reviving Shift Equivariance in Vision TransformersPeijian Ding et.al.2306.07470:mortar_board:None
2023-06-12Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-TrainingLorenzo Baraldi et.al.2306.07346:mortar_board:Code
2023-06-12Revisiting Token Pruning for Object Detection and Instance SegmentationYifei Liu et.al.2306.07050:mortar_board:None
2023-06-12Enhancing COVID-19 Diagnosis through Vision Transformer-Based Analysis of Chest X-ray ImagesSultan Zavrak et.al.2306.06914:mortar_board:None
2023-06-12Unmasking Deepfakes: Masked Autoencoding Spatiotemporal Transformers for Enhanced Video Forgery DetectionSayantan Das et.al.2306.06881:mortar_board:None
2023-06-11E(2)-Equivariant Vision TransformerRenjun Xu et.al.2306.06722:mortar_board:Code
2023-06-112-D SSM: A General Spatial Layer for Visual TransformersEthan Baron et.al.2306.06635:mortar_board:Code
2023-06-10Vista-Morph: Unsupervised Image Registration of Visible-Thermal Facial PairsCatherine Ordun et.al.2306.06505:mortar_board:None
2023-06-10ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision TransformerHaoran You et.al.2306.06446:mortar_board:None
2023-06-09SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision TransformersBowen Zhang et.al.2306.06289:mortar_board:Code
2023-06-09FLSL: Feature-level Self-supervised LearningQing Su et.al.2306.06203:mortar_board:None
2023-06-09FasterViT: Fast Vision Transformers with Hierarchical AttentionAli Hatamizadeh et.al.2306.06189:mortar_board:Code
2023-06-09Customizing General-Purpose Foundation Models for Medical Report GenerationBang Yang et.al.2306.05642:mortar_board:None
2023-06-08Is Attentional Channel Processing Design Required? Comprehensive Analysis Of Robustness Between Vision Transformers And Fully Attentional NetworksAbhishri Ajit Medewar et.al.2306.05495:mortar_board:None
2023-06-08Connectional-Style-Guided Contextual Representation Learning for Brain Disease DiagnosisGongshu Wang et.al.2306.05297:mortar_board:None
2023-06-08Improving Visual Prompt Tuning for Self-supervised Vision TransformersSeungryong Yoo et.al.2306.05067:mortar_board:Code
2023-06-08Neighborhood Attention Makes the Encoder of ResUNet Stronger for Accurate Road ExtractionAli Jamali et.al.2306.04947:mortar_board:None
2023-06-08Muti-Scale And Token Mergence: Make Your ViT More EfficientZhe Bian et.al.2306.04897:mortar_board:None
2023-06-07Optimizing ViViT Training: Time and Memory Reduction for Action RecognitionShreyank N Gowda et.al.2306.04822:mortar_board:None
2023-06-07Revising deep learning methods in parking lot occupancy detectionAnastasia Martynova et.al.2306.04288:mortar_board:Code
2023-06-07Normalization Layers Are All That Sharpness-Aware Minimization NeedsMaximilian Mueller et.al.2306.04226:mortar_board:Code
2023-06-07Efficient Vision Transformer for Human Pose Estimation via Patch SelectionKaleab A. Kinfu et.al.2306.04225:mortar_board:None
2023-06-07TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image SegmentationTao Lei et.al.2306.04086:mortar_board:Code
2023-06-06Human-imperceptible, Machine-recognizable ImagesFusheng Hao et.al.2306.03679:mortar_board:Code
2023-06-06LegoNet: Alternating Model Blocks for Medical Image SegmentationIkboljon Sobirov et.al.2306.03494:mortar_board:None
2023-06-06Efficient Anomaly Detection with Budget Annotation Using Semi-Supervised Residual TransformerHanxi Li et.al.2306.03492:mortar_board:None
2023-06-06Clinical-Inspired Cytological Whole Slide Image Screening with Just Slide-Level LabelsBeidi Zhao et.al.2306.03407:mortar_board:None
2023-06-06CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image SegmentationTao Lei et.al.2306.03373:mortar_board:Code
2023-06-05A Vessel-Segmentation-Based CycleGAN for Unpaired Multi-modal Retinal Image SynthesisAline Sindel et.al.2306.02901:mortar_board:None
2023-06-05Learning Probabilistic Symmetrization for Architecture Agnostic EquivarianceJinwoo Kim et.al.2306.02866:mortar_board:Code
2023-06-05On the Role of ViT and CNN in Semantic Communications: Analysis and Prototype ValidationHanju Yoo et.al.2306.02759:mortar_board:None
2023-06-03TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal DomainSagar Chakraborty et.al.2306.02142:mortar_board:Code
2023-06-03Content-aware Token Sharing for Efficient Semantic Segmentation with Vision TransformersChenyang Lu et.al.2306.02095:mortar_board:Code
2023-06-03Memorization Capacity of Multi-Head Attention in TransformersSadegh Mahdavi et.al.2306.02010:mortar_board:Code
2023-06-02Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent WorkQiangchang Wang et.al.2306.01929:mortar_board:None
2023-06-02A Novel Vision Transformer with Residual in Self-attention for Biomedical Image ClassificationArun K. Sharma et.al.2306.01594:mortar_board:None
2023-06-02NNMobile-Net: Rethinking CNN Design for Deep Learning-Based Retinopathy ResearchWenhui Zhu et.al.2306.01289:mortar_board:Code
2023-06-01Hiera: A Hierarchical Vision Transformer without the Bells-and-WhistlesChaitanya Ryali et.al.2306.00989:mortar_board:Code
2023-06-01DeepFake-Adapter: Dual-Level Adapter for DeepFake DetectionRui Shao et.al.2306.00863:mortar_board:Code
2023-06-01Auto-Spikformer: Spikformer Architecture SearchKaiwei Che et.al.2306.00807:mortar_board:None
2023-06-01DAM-Net: Global Flood Detection from SAR Imagery Using Differential Attention Metric-Based Vision TransformersTamer Saleh et.al.2306.00704:mortar_board:Code
2023-06-01Lightweight Vision Transformer with Bidirectional InteractionQihang Fan et.al.2306.00396:mortar_board:Code
2023-06-01Affinity-based Attention in Self-supervised Transformers Predicts Dynamics of Object Grouping in HumansHossein Adeli et.al.2306.00294:mortar_board:Code
2023-05-31Self-supervised Vision Transformers for 3D Pose Estimation of Novel ObjectsStefan Thalhammer et.al.2306.00129:mortar_board:Code
2023-05-31Diagnosis and Prognosis of Head and Neck Cancer Patients using Artificial IntelligenceIkboljon Sobirov et.al.2306.00034:mortar_board:None
2023-05-31LOWA: Localize Objects in the Wild with AttributesXiaoyuan Guo et.al.2305.20047:mortar_board:None
2023-05-31DOTA: A Dynamically-Operated Photonic Tensor Core for Energy-Efficient Transformer AcceleratorHanqing Zhu et.al.2305.19533:mortar_board:None
2023-05-31CVSNet: A Computer Implementation for Central Visual System of The BrainRuimin Gao et.al.2305.19492:mortar_board:None
2023-05-30Are Large Kernels Better Teachers than Transformers for ConvNets?Tianjin Huang et.al.2305.19412:mortar_board:Code
2023-05-30Contextual Vision Transformers for Robust Representation LearningYujia Bao et.al.2305.19402:mortar_board:None
2023-05-30Vision Transformers for Mobile Applications: A Short SurveyNahid Alam et.al.2305.19365:mortar_board:None
2023-05-30Prompt-based Tuning of Transformer Models for Multi-Center Medical Image SegmentationNuman Saeed et.al.2305.18948:mortar_board:None
2023-05-30Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-ExpertsRishov Sarkar et.al.2305.18691:mortar_board:Code
2023-05-29Solar Irradiance Anticipative TransformerThomas M. Mercier et.al.2305.18487:mortar_board:Code
2023-05-29DiffRate : Differentiable Compression Rate for Efficient Vision TransformersMengzhao Chen et.al.2305.17997:mortar_board:Code
2023-05-29Streaming Audio Transformers for Online Audio TaggingHeinrich Dinkel et.al.2305.17834:mortar_board:Code
2023-05-28LowDINO – A Low Parameter Self Supervised Learning ModelSai Krishna Prathapaneni et.al.2305.17791:mortar_board:Code
2023-05-27Vision Transformers for Small Histological Datasets Learned through Knowledge DistillationNeel Kanwal et.al.2305.17370:mortar_board:None
2023-05-27Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained TransformersHongjie Wang et.al.2305.17328:mortar_board:None
2023-05-26COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision ModelsJinqi Xiao et.al.2305.17235:mortar_board:Code
2023-05-26Do We Really Need a Large Number of Visual Prompts?Youngeun Kim et.al.2305.17223:mortar_board:None
2023-05-25Making Vision Transformers Truly Shift-EquivariantRenan A. Rojas-Gomez et.al.2305.16316:mortar_board:None
2023-05-25Sharpness-Aware Minimization Leads to Low-Rank FeaturesMaksym Andriushchenko et.al.2305.16292:mortar_board:Code
2023-05-25Multi-scale Efficient Graph-Transformer for Whole Slide Image ClassificationSaisai Ding et.al.2305.15773:mortar_board:None
2023-05-24ViTMatte: Boosting Image Matting with Pretrained Plain Vision TransformersJingfeng Yao et.al.2305.15272:mortar_board:Code
2023-05-24ICDAR 2023 Competition on Robust Layout Segmentation in Corporate DocumentsChristoph Auer et.al.2305.14962:mortar_board:None
2023-05-24Predicting Token Impact Towards Efficient Vision TransformerHong Wang et.al.2305.14840:mortar_board:None
2023-05-24Dual Path Transformer with Partition AttentionZhengkai Jiang et.al.2305.14768:mortar_board:None
2023-05-24BinaryViT: Towards Efficient and Accurate Binary Vision TransformersJunrui Xiao et.al.2305.14730:mortar_board:None
2023-05-24Quantifying Character Similarity with Vision TransformersXinmei Yang et.al.2305.14672:mortar_board:Code
2023-05-24Reinforcement Learning finetuned Vision-Code Transformer for UI-to-Code GenerationDavit Soselia et.al.2305.14637:mortar_board:None
2023-05-23Source-Free Domain Adaptation for RGB-D Semantic Segmentation with Vision TransformersGiulia Rizzoli et.al.2305.14269:mortar_board:None
2023-05-22Efficient Large-Scale Vision Representation LearningEden Dolev et.al.2305.13399:mortar_board:None
2023-05-22U-DiT TTS: U-Diffusion Vision Transformer for Text-to-SpeechXin Jing et.al.2305.13195:mortar_board:None
2023-05-22DeepJSCC-l++: Robust and Bandwidth-Adaptive Wireless Image TransmissionChenghong Bian et.al.2305.13161:mortar_board:None
2023-05-22Getting ViT in Shape: Scaling Laws for Compute-Optimal Model DesignIbrahim Alabdulmohsin et.al.2305.13035:mortar_board:None
2023-05-22HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic SegmentationJian Ding et.al.2305.13031:mortar_board:None
2023-05-22Why current rain denoising models fail on CycleGAN created rain images in autonomous drivingMichael Kranl et.al.2305.12983:mortar_board:None
2023-05-22VanillaNet: the Power of Minimalism in Deep LearningHanting Chen et.al.2305.12972:mortar_board:Code
2023-05-22TSPTQ-ViT: Two-scaled post-training quantization for vision transformerYu-Shan Tai et.al.2305.12901:mortar_board:None
2023-05-22Spatiotemporal Attention-based Semantic Compression for Real-time Video RecognitionNan Li et.al.2305.12796:mortar_board:None
2023-05-21Your smartphone could act as a pulse-oximeter and as a single-lead ECGAhsan Mehmood et.al.2305.12583:mortar_board:None
2023-05-21Bi-ViT: Pushing the Limit of Vision Transformer QuantizationYanjing Li et.al.2305.12354:mortar_board:None
2023-05-19Multimodal Web Navigation with Instruction-Finetuned Foundation ModelsHiroki Furuta et.al.2305.11854:mortar_board:None
2023-05-19Surgical-VQLA: Transformer with Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic SurgeryLong Bai et.al.2305.11692:mortar_board:Code
2023-05-19SurgMAE: Masked Autoencoders for Long Surgical Video AnalysisMuhammad Abdullah Jamal et.al.2305.11451:mortar_board:None
2023-05-18How Deep Learning Sees the World: A Survey on Adversarial Attacks & DefensesJoana C. Costa et.al.2305.10862:mortar_board:None
2023-05-18Boost Vision Transformer with GPU-Friendly Sparsity and QuantizationChong Yu et.al.2305.10727:mortar_board:None
2023-05-17CageViT: Convolutional Activation Guided Efficient Vision TransformerHao Zheng et.al.2305.09924:mortar_board:None
2023-05-17A survey of the Vision Transformers and its CNN-Transformer based VariantsAsifullah Khan et.al.2305.09880:mortar_board:None
2023-05-16Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality TokenJinsong Shi et.al.2305.09353:mortar_board:Code
2023-05-16CB-HVTNet: A channel-boosted hybrid vision transformer network for lymphocyte assessment in histopathological imagesMomina Liaqat Ali et.al.2305.09211:mortar_board:None
2023-05-16Style Transfer Enabled Sim2Real Framework for Efficient Learning of Robotic Ultrasound Image Analysis Using Simulated DataKeyu Li et.al.2305.09169:mortar_board:None
2023-05-13M²DAR: Multi-View Multi-Scale Driver Action Recognition with Vision TransformerYunsheng Ma et.al.2305.08877:mortar_board:Code
2023-05-15AutoRecon: Automated 3D Object Discovery and ReconstructionYuang Wang et.al.2305.08810:mortar_board:None
2023-05-15Enhancing Performance of Vision Transformers on Small Datasets through Local Inductive Bias IncorporationIbrahim Batuhan Akkaya et.al.2305.08551:mortar_board:None
2023-05-15MaxViT-UNet: Multi-Axis Attention for Medical Image SegmentationAbdul Rehman et.al.2305.08396:mortar_board:None
2023-05-14On enhancing the robustness of Vision Transformers: Defensive DiffusionRaza Imam et.al.2305.08031:mortar_board:None
2023-05-13GSB: Group Superposition Binarization for Vision Transformer with Limited Training SamplesTian Gao et.al.2305.07931:mortar_board:Code
2023-05-13Meta-Polyp: a baseline for efficient Polyp segmentationQuoc-Huy Trinh et.al.2305.07848:mortar_board:Code
2023-05-12ViT Unified: Joint Fingerprint Recognition and Presentation Attack DetectionSteven A. Grosz et.al.2305.07602:mortar_board:None
2023-05-11OneCAD: One Classifier for All image Datasets using multimodal learningShakti N. Wadekar et.al.2305.07167:mortar_board:None
2023-05-11Salient Mask-Guided Vision Transformer for Fine-Grained ClassificationDmitry Demidov et.al.2305.07102:mortar_board:Code
2023-05-11EfficientViT: Memory Efficient Vision Transformer with Cascaded Group AttentionXinyu Liu et.al.2305.07027:mortar_board:Code
2023-05-11Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision TransformersDahun Kim et.al.2305.07011:mortar_board:None
2023-05-11Extending Audio Masked Autoencoders Toward Audio RestorationZhi Zhong et.al.2305.06701:mortar_board:None
2023-05-11Undercover Deepfakes: Detecting Fake Segments in VideosSanjay Saha et.al.2305.06564:mortar_board:Code
2023-05-11Patch-wise Mixed-Precision Quantization of Vision TransformerJunrui Xiao et.al.2305.06559:mortar_board:None
2023-05-08Joint Moment Retrieval and Highlight Detection Via Natural Language QueriesRichard Luo et.al.2305.04961:mortar_board:Code
2023-05-08BiRT: Bio-inspired Replay in Vision Transformers for Continual LearningKishaan Jeeveswaran et.al.2305.04769:mortar_board:Code
2023-05-08Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive FieldsBum Jun Kim et.al.2305.04722:mortar_board:None
2023-05-08Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic CountingZhicheng Wang et.al.2305.04440:mortar_board:None
2023-05-05FM-ViT: Flexible Modal Vision Transformers for Face Anti-SpoofingAjian Liu et.al.2305.03277:mortar_board:None
2023-05-05Semantic Segmentation using Vision Transformers: A surveyHans Thisanke et.al.2305.03273:mortar_board:None
2023-05-04AttentionViz: A Global View of Transformer AttentionCatherine Yeh et.al.2305.03210:mortar_board:None
2023-05-03Real-Time Radiance Fields for Single-Image Portrait View SynthesisAlex Trevithick et.al.2305.02310:mortar_board:None
2023-05-03Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant ModelsQiufeng Wang et.al.2305.02279:mortar_board:None
2023-05-03A Vision Transformer Approach for Efficient Near-Field Irregular SAR Super-ResolutionJosiah Smith et.al.2305.02074:mortar_board:None
2023-05-03“Glitch in the Matrix!”: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and LocalizationZhixi Cai et.al.2305.01979:mortar_board:Code
2023-05-02High-Resolution Synthetic RGB-D Datasets for Monocular Depth EstimationAakash Rajpal et.al.2305.01732:mortar_board:None
2023-05-02ARBEx: Attentive Feature Extraction with Reliability Balancing for Robust Facial Expression LearningAzmine Toushik Wasi et.al.2305.01486:mortar_board:Code
2023-05-02AxWin Transformer: A Context-Aware Vision Transformer Backbone with Axial WindowsFangjian Lin et.al.2305.01280:mortar_board:None
2023-05-02Exploring vision transformer layer choosing for semantic segmentationFangjian Lin et.al.2305.01279:mortar_board:None
2023-05-01What Do Self-Supervised Vision Transformers Learn?Namuk Park et.al.2305.00729:mortar_board:Code
2023-05-01Rethinking Boundary Detection in Deep Learning Models for Medical Image SegmentationYi Lin et.al.2305.00678:mortar_board:Code
2023-04-30Consolidator: Mergeable Adapter with Grouped Connections for Visual AdaptationTianxiang Hao et.al.2305.00603:mortar_board:None
2023-04-28MMViT: Multiscale Multiview Vision TransformersYuchen Liu et.al.2305.00104:mortar_board:None
2023-04-28An automated end-to-end deep learning-based framework for lung cancer diagnosis by detecting and classifying the lung nodulesSamiul Based Shuvo et.al.2305.00046:mortar_board:None
2023-04-28Representation Matters: The Game of Chess Poses a Challenge to Vision TransformersJohannes Czech et.al.2304.14918:mortar_board:None
2023-04-28PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture SearchHaibin Wang et.al.2304.14636:mortar_board:None
2023-04-28DIAMANT: Dual Image-Attention Map Encoders For Medical Image SegmentationYousef Yeganeh et.al.2304.14571:mortar_board:None
2023-04-27Vision Conformer: Incorporating Convolutions into Vision Transformer LayersBrian Kenji Iwana et.al.2304.13991:mortar_board:Code
2023-04-26UniNeXt: Exploring A Unified Architecture for Vision RecognitionFangjian Lin et.al.2304.13700:mortar_board:None
2023-04-25Objectives Matter: Understanding the Impact of Self-Supervised Objectives on Vision Transformer RepresentationsShashank Shekhar et.al.2304.13089:mortar_board:None
2023-04-25CompletionFormer: Depth Completion with Convolutions and Vision TransformersZhang Youmin et.al.2304.13030:mortar_board:Code
2023-04-25Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient TuningZhongzhi Yu et.al.2304.12520:mortar_board:None
2023-04-24Rank Flow Embedding for Unsupervised and Semi-Supervised Manifold LearningLucas Pascotti Valem et.al.2304.12448:mortar_board:Code
2023-04-24Augmentation-based Domain Generalization for Semantic SegmentationManuel Schwonberg et.al.2304.12122:mortar_board:None
2023-04-24MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision TransformerQihao Zhao et.al.2304.12043:mortar_board:Code
2023-04-24Transformer-based stereo-aware 3D object detection from binocular imagesHanqing Sun et.al.2304.11906:mortar_board:None
2023-04-24Universal Domain Adaptation via Compressive Attention MatchingDidi Zhu et.al.2304.11862:mortar_board:None
2023-04-23Vision Transformer for Efficient Chest X-ray and Gastrointestinal Image ClassificationSmriti Regmi et.al.2304.11529:mortar_board:None
2023-04-22Vision Transformers, a new approach for high-resolution and large-scale mapping of canopy heightsIbrahim Fayad et.al.2304.11487:mortar_board:None
2023-04-22Self-supervised Learning by View SynthesisShaoteng Liu et.al.2304.11330:mortar_board:None
2023-04-21Deep-Learning-based Fast and Accurate 3D CT Deformable Image Registration in Lung CancerYuzhen Ding et.al.2304.11135:mortar_board:None
2023-04-21DeformableFormer: Classification of Endoscopic Ultrasound Guided Fine Needle Biopsy in Pancreatic DiseasesTaiji Kurami et.al.2304.10791:mortar_board:None
2023-04-21Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision TransformersSiyuan Wei et.al.2304.10716:mortar_board:Code
2023-04-20HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative perception with vision transformerHao Xiang et.al.2304.10628:mortar_board:None
2023-04-20Contrastive Tuning: A Little Help to Make Masked Autoencoders ForgetJohannes Lehner et.al.2304.10520:mortar_board:Code
2023-04-19LipsFormer: Introducing Lipschitz Continuity to Vision TransformersXianbiao Qi et.al.2304.09856:mortar_board:Code
2023-04-19Transformer-Based Visual Segmentation: A SurveyXiangtai Li et.al.2304.09854:mortar_board:Code
2023-04-19CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image UnderstandingDilxat Muhtar et.al.2304.09670:mortar_board:Code
2023-04-19Boosting Semantic Segmentation with Semantic BoundariesHaruya Ishikawa et.al.2304.09427:mortar_board:Code
2023-04-18Fibroglandular Tissue Segmentation in Breast MRI using Vision Transformers – A multi-institutional evaluationGustav Müller-Franzes et.al.2304.08972:mortar_board:Code
2023-04-18AutoTaskFormer: Searching Vision Transformers for Multi-task LearningYang Liu et.al.2304.08756:mortar_board:None
2023-04-17Synthetic Data from Diffusion Models Improves ImageNet ClassificationShekoofeh Azizi et.al.2304.08466:mortar_board:None
2023-04-17Efficient Video Action Detection with Token Dropout and Context RefinementLei Chen et.al.2304.08451:mortar_board:None
2023-04-17Transformer with Selective Shuffled Position Embedding using ROI-Exchange Strategy for Early Detection of Knee OsteoarthritisZhe Wang et.al.2304.08364:mortar_board:None
2023-04-17The Universe is worth 64³ pixels: Convolution Neural Network and Vision Transformers for CosmologySe Yeon Hwang et.al.2304.08192:mortar_board:None
2023-04-17ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction DetectionJeeseung Park et.al.2304.08114:mortar_board:None
2023-04-16A Data-Centric Solution to NonHomogeneous Dehazing via Vision TransformerYangyi Liu et.al.2304.07874:mortar_board:Code
2023-04-15MA-ViT: Modality-Agnostic Vision Transformers for Face Anti-SpoofingAjian Liu et.al.2304.07549:mortar_board:None
2023-04-14Uncovering the Inner Workings of STEGO for Safe Unsupervised Semantic SegmentationAlexander Koenig et.al.2304.07314:mortar_board:None
2023-04-14CAD-RADS scoring of coronary CT angiography with Multi-Axis Vision Transformer: a clinically-inspired deep learning pipelineAlessia Gerbasi et.al.2304.07277:mortar_board:None
2023-04-14Sub-meter resolution canopy height maps using self-supervised learning and a vision transformer trained on Aerial and GEDI LidarJamie Tolan et.al.2304.07213:mortar_board:None
2023-04-14Preserving Locality in Vision Transformers for Class Incremental LearningBowen Zheng et.al.2304.06971:mortar_board:None
2023-04-13SpectFormer: Frequency and Attention is what you need in a Vision TransformerBadri N. Patro et.al.2304.06446:mortar_board:None
2023-04-13VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch MaskingAngelos Nalmpantis et.al.2304.06391:mortar_board:Code
2023-04-13Converting ECG Signals to Images for Efficient Image-text Retrieval via EncodingJielin Qiu et.al.2304.06286:mortar_board:None
2023-04-13RSIR Transformer: Hierarchical Vision Transformer using Random Sampling Windows and Important Region WindowsZhemin Zhang et.al.2304.06250:mortar_board:None
2023-04-12Towards Evaluating Explanations of Vision Transformers for Medical ImagingPiotr Komorowski et.al.2304.06133:mortar_board:Code
2023-04-12RECLIP: Resource-efficient CLIP by Training with Small ImagesRunze Li et.al.2304.06028:mortar_board:None
2023-04-12Rail Detection: An Efficient Row-based Network and A New BenchmarkXinpeng Li et.al.2304.05667:mortar_board:Code
2023-04-12RIFormer: Keep Your Vision Backbone Effective While Removing Token MixerJiahao Wang et.al.2304.05659:mortar_board:None
2023-04-12CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary TasksYi Li et.al.2304.05653:mortar_board:Code
2023-04-11MC-ViViT: Multi-branch Classifier-ViViT to Detect Mild Cognitive Impairment in Older Adults using Facial VideosJian Sun et.al.2304.05292:mortar_board:None
2023-04-11A Billion-scale Foundation Model for Remote Sensing ImagesKeumgang Cha et.al.2304.05215:mortar_board:None
2023-04-11Open Set Classification of GAN-based Image Manipulations via a ViT-based Hybrid ArchitectureJun Wang et.al.2304.05212:mortar_board:None
2023-04-11WEAR: A Multimodal Dataset for Wearable and Egocentric Video Activity RecognitionMarius Bock et.al.2304.05088:mortar_board:None
2023-04-11Life Regression based Patch Slimming for Vision TransformersJiawei Chen et.al.2304.04926:mortar_board:None
2023-04-10ViT-Calibrator: Decision Stream Calibration for Vision TransformerLin Chen et.al.2304.04354:mortar_board:None
2023-04-09ForamViT-GAN: Exploring New Paradigms in Deep Learning for Micropaleontological Image AnalysisIvan Ferreira-Chacua et.al.2304.04291:mortar_board:None
2023-04-09Slide-Transformer: Hierarchical Vision Transformer with Local Self-AttentionXuran Pan et.al.2304.04237:mortar_board:Code
2023-04-07A Cross-Scale Hierarchical Transformer with Correspondence-Augmented Attention for inferring Bird’s-Eye-View Semantic SegmentationNaiyu Fang et.al.2304.03650:mortar_board:None
2023-04-07PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive ShiftGaojie Wu et.al.2304.03481:mortar_board:None
2023-04-06$R^{2}$Former: Unified Retrieval and Reranking Transformer for Place RecognitionSijie Zhu et.al.2304.03410:mortar_board:None
2023-04-06From Saliency to DINO: Saliency-guided Vision Transformer for Few-shot Keypoint DetectionChangsheng Lu et.al.2304.03140:mortar_board:None
2023-04-06InterFormer: Real-time Interactive Image SegmentationYou Huang et.al.2304.02942:mortar_board:Code
2023-04-06Towards an Effective and Efficient Transformer for Rain-by-snow Weather RemovalTao Gao et.al.2304.02860:mortar_board:Code
2023-04-06MULLER: Multilayer Laplacian Resizer for VisionZhengzhong Tu et.al.2304.02859:mortar_board:None
2023-04-05Training Strategies for Vision Transformers for Object DetectionApoorv Singh et.al.2304.02186:mortar_board:None
2023-04-04Strong Baselines for Parameter Efficient Few-Shot Fine-tuningSamyadeep Basu et.al.2304.01917:mortar_board:None
2023-04-04EPVT: Environment-aware Prompt Vision Transformer for Domain Generalization in Skin Lesion RecognitionSiyuan Yan et.al.2304.01508:mortar_board:None
2023-04-04Attention Map Guided Transformer Pruning for Edge DeviceJunzhu Mao et.al.2304.01452:mortar_board:Code
2023-04-03WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic SegmentationLianghui Zhu et.al.2304.01184:mortar_board:Code
2023-04-03ViT-DAE: Transformer-driven Diffusion Autoencoder for Histopathology Image AnalysisXuan Xu et.al.2304.01053:mortar_board:None
2023-04-01Vision Transformers with Mixed-Resolution TokenizationTomer Ronen et.al.2304.00287:mortar_board:Code
2023-03-31Hierarchical Vision Transformers for Cardiac Ejection Fraction EstimationLhuqita Fazry et.al.2304.00177:mortar_board:Code
2023-03-31Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?Arjun Majumdar et.al.2303.18240:mortar_board:None
2023-03-31LaCViT: A Label-aware Contrastive Training Framework for Vision TransformersZijun Long et.al.2303.18013:mortar_board:None
2023-03-31Exploring the Limits of Deep Image Clustering using Pretrained ModelsNikolas Adaloglou et.al.2303.17896:mortar_board:None
2023-03-31Visual Anomaly Detection via Dual-Attention Transformer and Discriminative FlowHaiming Yao et.al.2303.17882:mortar_board:None
2023-03-31Rethinking Local Perception in Lightweight Vision TransformerQihang Fan et.al.2303.17803:mortar_board:None
2023-03-30If At First You Don’t Succeed: Test Time Re-ranking for Zero-shot, Cross-domain RetrievalFinlay G. C. Hudson et.al.2303.17703:mortar_board:None
2023-03-30Whether and When does Endoscopy Domain Pretraining Make Sense?Dominik Batić et.al.2303.17636:mortar_board:None
2023-03-30SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision TransformerXuanyao Chen et.al.2303.17605:mortar_board:None
2023-03-30MobileInst: Video Instance Segmentation on the MobileRenhong Zhang et.al.2303.17594:mortar_board:None
2023-03-30Streaming Video ModelYucheng Zhao et.al.2303.17228:mortar_board:Code
2023-03-30ImageNet-E: Benchmarking Neural Network Robustness via Attribute EditingXiaodan Li et.al.2303.17096:mortar_board:Code
2023-03-29Visually Wired NFTs: Exploring the Role of Inspiration in Non-Fungible TokensLucio La Cava et.al.2303.17031:mortar_board:None
2023-03-29T-FFTRadNet: Object Detection with Swin Vision Transformers from Raw ADC Radar SignalsJames Giroux et.al.2303.16940:mortar_board:None
2023-03-29Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image SegmentationMd Mostafijur Rahman et.al.2303.16892:mortar_board:Code
2023-03-29Self-accumulative Vision Transformer for Bone Age Assessment Using the Sauvegrain MethodHong-Jun Choi et.al.2303.16557:mortar_board:None
2023-03-28ASIC: Aligning Sparse in-the-wild Image CollectionsKamal Gupta et.al.2303.16201:mortar_board:None
2023-03-28Transferable Adversarial Attacks on Vision Transformers with Token Gradient RegularizationJianping Zhang et.al.2303.15754:mortar_board:None
2023-03-28TFS-ViT: Token-Level Feature Stylization for Domain GeneralizationMehrdad Noori et.al.2303.15698:mortar_board:Code
2023-03-27Learning Expressive Prompting With Residuals for Vision TransformersRajshekhar Das et.al.2303.15591:mortar_board:None
2023-03-27Core-Periphery Principle Guided Redesign of Self-Attention in TransformersXiaowei Yu et.al.2303.15569:mortar_board:None
2023-03-27MoViT: Memorizing Vision Transformers for Medical Image AnalysisYiqing Shen et.al.2303.15553:mortar_board:None
2023-03-24Image Deblurring by Exploring In-depth Properties of TransformerPengwei Liang et.al.2303.15198:mortar_board:None
2023-03-27Vision Transformer with Quadrangle AttentionQiming Zhang et.al.2303.15105:mortar_board:Code
2023-03-27Leveraging Hidden Positives for Unsupervised Semantic SegmentationHyun Seok Seong et.al.2303.15014:mortar_board:Code
2023-03-27Transformer-based Multi-Instance Learning for Weakly Supervised Object DetectionZhaofei Wang et.al.2303.14999:mortar_board:None
2023-03-26Feature Shrinkage Pyramid for Camouflaged Object Detection with TransformersZhou Huang et.al.2303.14816:mortar_board:Code
2023-03-26Contrastive Transformer: Contrastive Learning Scheme with Transformer innate PatchesSander Riisøen Jyhne et.al.2303.14806:mortar_board:None
2023-03-25Prompt-Guided Transformers for End-to-End Open-Vocabulary Object DetectionHwanjun Song et.al.2303.14386:mortar_board:None
2023-03-25Multi-view knowledge distillation transformer for human action recognitionYing-Chen Lin et.al.2303.14358:mortar_board:None
2023-03-25Towards Accurate Post-Training Quantization for Vision TransformerYifu Ding et.al.2303.14341:mortar_board:None
2023-03-24FastViT: A Fast Hybrid Vision Transformer using Structural ReparameterizationPavan Kumar Anasosalu Vasu et.al.2303.14189:mortar_board:None
2023-03-24Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision TransformersCong Wei et.al.2303.13755:mortar_board:None
2023-03-24How Does Attention Work in Vision Transformers? A Visual Analytics AttemptYiran Li et.al.2303.13731:mortar_board:None
2023-03-23Scaled Quantization for the Vision TransformerYangyang Chang et.al.2303.13601:mortar_board:None
2023-03-23Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game PerspectiveJinjing Zhu et.al.2303.13434:mortar_board:None
2023-03-23A Permutable Hybrid Network for Volumetric Medical Image SegmentationYi Lin et.al.2303.13111:mortar_board:None
2023-03-23MMFormer: Multimodal Transformer Using Multiscale Self-Attention for Remote Sensing Image ClassificationBo Zhang et.al.2303.13101:mortar_board:None
2023-03-23Top-Down Visual Attention from Analysis by SynthesisBaifeng Shi et.al.2303.13043:mortar_board:None
2023-03-23MonoATT: Online Monocular 3D Object Detection with Adaptive Token TransformerYunsong Zhou et.al.2303.13018:mortar_board:None
2023-03-22TRON: Transformer Neural Network Acceleration with Non-Coherent Silicon PhotonicsSalma Afifi et.al.2303.12914:mortar_board:None
2023-03-22Q-HyViT: Post-Training Quantization for Hybrid Vision Transformer with Bridge Block ReconstructionJemin Lee et.al.2303.12557:mortar_board:None
2023-03-22Multiscale Attention via Wavelet Neural Operators for Vision TransformersAnahita Nekoozadeh et.al.2303.12398:mortar_board:None
2023-03-21Machine Learning for Brain Disorders: Transformers and Visual TransformersRobin Courant et.al.2303.12068:mortar_board:None
2023-03-18Vision Transformer-based Model for Severity Quantification of Lung Pneumonia Using Chest X-ray ImagesBouthaina Slika et.al.2303.11935:mortar_board:None
2023-03-21The Multiscale Surface Vision TransformerSimon Dahan et.al.2303.11909:mortar_board:Code
2023-03-21CLIP-ReIdent: Contrastive Training for Player Re-IdentificationKonrad Habel et.al.2303.11855:mortar_board:None
2023-03-20Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D UnderstandingJihao Liu et.al.2303.11325:mortar_board:None
2023-03-20Robustifying Token Attention for Vision TransformersYong Guo et.al.2303.11126:mortar_board:None
2023-03-17LION: Implicit Vision Prompt TuningHaixin Wang et.al.2303.09992:mortar_board:None
2023-03-16Vision Transformer for Action Units DetectionTu Vu et.al.2303.09917:mortar_board:None
2023-03-16Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget LessRizhao Cai et.al.2303.09914:mortar_board:None
2023-03-17Dual-path Adaptation from Image to Video TransformersJungin Park et.al.2303.09857:mortar_board:Code
2023-03-17Denoising Diffusion Autoencoders are Unified Self-supervised LearnersWeilai Xiang et.al.2303.09769:mortar_board:None
2023-03-17ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile DevicesChen Tang et.al.2303.09730:mortar_board:None
2023-03-15ViTO: Vision Transformer-OperatorOded Ovadia et.al.2303.08891:mortar_board:None
2023-03-15DeepMIM: Deep Supervision for Masked Image ModelingSucheng Ren et.al.2303.08817:mortar_board:Code
2023-03-15BiFormer: Vision Transformer with Bi-Level Routing AttentionLei Zhu et.al.2303.08810:mortar_board:Code
2023-03-15Query-guided Attention in Vision Transformers for Localizing Objects Using a Single SketchAditay Tripathi et.al.2303.08784:mortar_board:None
2023-03-15Making Vision Transformers Efficient from A Token Sparsification ViewShuning Chang et.al.2303.08685:mortar_board:None
2023-03-14Learning to Grow Artificial Hippocampi in Vision Transformers for Resilient Lifelong LearningChinmay Savadikar et.al.2303.08250:mortar_board:None
2023-03-14Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer’s Disease DetectionNikhil J. Dhinagar et.al.2303.08216:mortar_board:None
2023-03-14Quaternion Orthogonal Transformer for Facial Expression Recognition in the WildYu Zhou et.al.2303.07831:mortar_board:Code
2023-03-14OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNavKarmesh Yadav et.al.2303.07798:mortar_board:None
2023-03-14CAT: Causal Audio Transformer for Audio ClassificationXiaoyu Liu et.al.2303.07626:mortar_board:None
2023-03-14AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+Xiao Wang et.al.2303.07598:mortar_board:Code
2023-03-14WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminative AnalysisYiye Chen et.al.2303.07543:mortar_board:None
2023-03-13Pretrained ViTs Yield Versatile Representations For Medical ImagesChristos Matsoukas et.al.2303.07034:mortar_board:Code
2023-03-13CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale AttentionWenxiao Wang et.al.2303.06908:mortar_board:Code
2023-03-13ST360IQ: No-Reference Omnidirectional Image Quality Assessment with Spherical Vision TransformersNafiseh Jabbari Tofighi et.al.2303.06907:mortar_board:Code
2023-03-13Three Guidelines You Should Know for Universally Slimmable Self-Supervised LearningYun-Hao Cao et.al.2303.06870:mortar_board:Code
2023-03-11Token Sparsification for Faster Medical Image SegmentationLei Zhou et.al.2303.06522:mortar_board:Code
2023-03-11Xformer: Hybrid X-Shaped Transformer for Image DenoisingJiale Zhang et.al.2303.06440:mortar_board:None
2023-03-11Stabilizing Transformer Training by Preventing Attention Entropy CollapseShuangfei Zhai et.al.2303.06296:mortar_board:None
2023-03-10Contrastive Language-Image Pretrained (CLIP) Models are Powerful Out-of-Distribution DetectorsFelix Michels et.al.2303.05828:mortar_board:None
2023-03-10Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for Medical Image SegmentationHo Hin Lee et.al.2303.05785:mortar_board:None
2023-03-10Human Pose Estimation from Ambiguous Pressure Recordings with Spatio-temporal Masked TransformersVandad Davoodnia et.al.2303.05691:mortar_board:None
2023-03-08UT-Net: Combining U-Net and Transformer for Joint Optic Disc and Cup Segmentation and Glaucoma DetectionRukhshanda Hussain et.al.2303.04939:mortar_board:None
2023-03-08X-Pruner: eXplainable Pruning for Vision TransformersLu Yu et.al.2303.04935:mortar_board:None
2023-03-08Centroid-centered Modeling for Efficient Vision Transformer Pre-trainingXin Yan et.al.2303.04664:mortar_board:None
2023-03-08HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge DevicesLotfi Abdelkrim Mecharbat et.al.2303.04440:mortar_board:None
2023-03-08SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV TrackingLiangliang Yao et.al.2303.04378:mortar_board:Code
2023-03-08SANDFORMER: CNN and Transformer under Gated Fusion for Sand Dust Image RestorationJun Shi et.al.2303.04365:mortar_board:None
2023-03-07Prediction of transonic flow over supercritical airfoils using geometric-encoding and deep-learning strategiesZhiwen Deng et.al.2303.03695:mortar_board:None
2023-03-07Weakly Supervised Caveline Detection For AUV Navigation Inside Underwater CavesBoxiao Yu et.al.2303.03670:mortar_board:None
2023-03-07PreFallKD: Pre-Impact Fall Detection via CNN-ViT Knowledge DistillationTin-Han Chi et.al.2303.03634:mortar_board:None
2023-03-06ST-KeyS: Self-Supervised Transformer for Keyword Spotting in Historical Handwritten DocumentsSana Khamekhem Jemni et.al.2303.03127:mortar_board:None
2023-03-06UniHCP: A Unified Model for Human-Centric PerceptionsYuanzheng Ci et.al.2303.02936:mortar_board:None
2023-03-04A Fast Training-Free Compression Framework for Vision TransformersJung Hwan Heo et.al.2303.02331:mortar_board:Code
2023-03-05DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural NetworkXuan Shen et.al.2303.02165:mortar_board:Code
2023-03-03Retinal Image Restoration using Transformer and Cycle-Consistent Generative Adversarial NetworkAlnur Alimanov et.al.2303.01939:mortar_board:Code
2023-03-03Attention-based Saliency Maps Improve Interpretability of Pneumothorax ClassificationAlessandro Wollek et.al.2303.01871:mortar_board:None
2023-03-02Self-attention in Vision Transformers Performs Perceptual Grouping, Not AttentionParia Mehrani et.al.2303.01542:mortar_board:None
2023-03-02Image as Set of PointsXu Ma et.al.2303.01494:mortar_board:Code
2023-03-02Token Contrast for Weakly-Supervised Semantic SegmentationLixiang Ru et.al.2303.01267:mortar_board:Code
2023-03-02Visual Atoms: Pre-training Vision Transformers with Sinusoidal WavesSora Takashima et.al.2303.01112:mortar_board:None
2023-03-02Learning to Grow Pretrained Models for Efficient Transformer TrainingPeihao Wang et.al.2303.00980:mortar_board:None
2023-03-02Enhancing General Face Forgery Detection via Vision Transformer with Low-Rank AdaptationChenqi Kong et.al.2303.00917:mortar_board:None
2023-03-01AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel ImagesRamin Nakhli et.al.2303.00865:mortar_board:Code
2023-02-28Generic-to-Specific Distillation of Masked AutoencodersWei Huang et.al.2302.14771:mortar_board:Code
2023-02-28Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D PriorsJi Hou et.al.2302.14746:mortar_board:None
2023-02-28DC-Former: Diverse and Compact Transformer for Person Re-IdentificationWen Li et.al.2302.14335:mortar_board:Code
2023-02-28Rethink Long-tailed Recognition with Vision TransformsZhengzhuo Xu et.al.2302.14284:mortar_board:None
2023-02-28Augmented Transformers with Adaptive n-grams Embedding for Multilingual Scene Text RecognitionXueming Yan et.al.2302.14261:mortar_board:None
2023-02-28Remote Sensing Scene Classification with Masked Image Modeling (MIM)Liya Wang et.al.2302.14256:mortar_board:None
2023-02-27UMIFormer: Mining the Correlations between Similar Tokens for Multi-View 3D ReconstructionZhenwei Zhu et.al.2302.13987:mortar_board:None
2023-02-27Spatially-Adaptive Feature Modulation for Efficient Image Super-ResolutionLong Sun et.al.2302.13800:mortar_board:Code
2023-02-26Autonomous Intelligent Navigation for Flexible Endoscopy Using Monocular Depth Guidance and 3-D Shape PlanningYiang Lu et.al.2302.13219:mortar_board:None
2023-02-24Amortised Invariance Learning for Contrastive Self-SupervisionRuchika Chavhan et.al.2302.12712:mortar_board:None
2023-02-24A Convolutional Vision Transformer for Semantic Segmentation of Side-Scan Sonar DataHayat Rajani et.al.2302.12416:mortar_board:Code
2023-02-23Boosting Adversarial Transferability using Dynamic CuesMuzammal Naseer et.al.2302.12252:mortar_board:None
2023-02-23StudyFormer : Attention-Based and Dynamic Multi View Classifier for X-ray imagesLucas Wannenmacher et.al.2302.11840:mortar_board:None
2023-02-22Magnification Invariant Medical Image Analysis: A Comparison of Convolutional Networks, Vision Transformers, and Token MixersPranav Jeevan et.al.2302.11488:mortar_board:None
2023-02-22Transformer-Based Sensor Fusion for Autonomous Driving: A SurveyApoorv Singh et.al.2302.11481:mortar_board:None
2023-02-22Human MotionFormer: Transferring Human Motions with Vision TransformersHongyu Liu et.al.2302.11306:mortar_board:None
2023-02-22A residual dense vision transformer for medical image super-resolution with segmentation-based perceptual loss fine-tuningJin Zhu et.al.2302.11184:mortar_board:None
2023-02-22Deep Active Learning in the Presence of Label Noise: A SurveyMoseli Mots’oehli et.al.2302.11075:mortar_board:None
2023-02-21SF2Former: Amyotrophic Lateral Sclerosis Identification From Multi-center MRI Data Using Spatial and Frequency Fusion TransformerRafsanjany Kushol et.al.2302.10859:mortar_board:Code
2023-02-21Bokeh Rendering Based on Adaptive Depth Calibration NetworkLu Liu et.al.2302.10808:mortar_board:None
2023-02-21MaskedKD: Efficient Distillation of Vision Transformers with Masked ImagesSeungwoo Son et.al.2302.10494:mortar_board:None
2023-02-21ApproxABFT: Approximate Algorithm-Based Fault Tolerance for Vision TransformersXinghua Xue et.al.2302.10469:mortar_board:None
2023-02-21Reliability Analysis of Vision TransformersXinghua Xue et.al.2302.10468:mortar_board:None
2023-02-21Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning PipelinesMin Cen et.al.2302.10406:mortar_board:None
2023-02-19MedViT: A Robust Vision Transformer for Generalized Medical Image ClassificationOmid Nejati Manzari et.al.2302.09462:mortar_board:None
2023-02-18VITAL: Vision Transformer Neural Networks for Accurate Smartphone Heterogeneity Resilient Indoor LocalizationDanish Gufran et.al.2302.09443:mortar_board:None
2023-02-18Hyneter: Hybrid Network Transformer for Object DetectionDong Chen et.al.2302.09365:mortar_board:None
2023-02-18Meta Style Adversarial Training for Cross-Domain Few-Shot LearningYuqian Fu et.al.2302.09309:mortar_board:None
2023-02-17ViTA: A Vision Transformer Inference Accelerator for Edge ApplicationsShashank Nag et.al.2302.09108:mortar_board:None
2023-02-17MCAE: Masked Contrastive Autoencoder for Face Anti-SpoofingTianyi Zheng et.al.2302.08674:mortar_board:None
2023-02-16Efficiency 360: Efficient Vision TransformersBadri N. Patro et.al.2302.08374:mortar_board:Code
2023-02-16TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image GenerationYunliang Jiang et.al.2302.08047:mortar_board:None
2023-02-15TFormer: A Transmission-Friendly ViT Model for IoT DevicesZhichao Lu et.al.2302.07734:mortar_board:None
2023-02-14Robust Representation Learning with Self-Distillation for Domain GeneralizationAnkur Singh et.al.2302.06874:mortar_board:None
2023-02-14DiffFashion: Reference-based Fashion Design with Structure-aware Transfer by Diffusion ModelsShidong Cao et.al.2302.06826:mortar_board:Code
2023-02-13A Comprehensive Study of Modern Architectures and Regularization Approaches on CheXpert5000Sontje Ihler et.al.2302.06684:mortar_board:None
2023-02-12A Theoretical Understanding of shallow Vision Transformers: Learning, Generalization, and Sample ComplexityHongkang Li et.al.2302.06015:mortar_board:None
2023-02-12Self-supervised Pseudo-colorizing of Masked CellsRoyden Wagner et.al.2302.05968:mortar_board:Code
2023-02-12Generalized Few-Shot Continual Learning with Contrastive Mixture of AdaptersYawen Cui et.al.2302.05936:mortar_board:Code
2023-02-11Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To LifeTim Whitaker et.al.2302.05818:mortar_board:None
2023-02-11Rethinking Vision Transformer and Masked Autoencoder in Multimodal Face Anti-SpoofingZitong Yu et.al.2302.05744:mortar_board:None
2023-02-10Scaling Vision Transformers to 22 Billion ParametersMostafa Dehghani et.al.2302.05442:mortar_board:None
2023-02-09Reversible Vision TransformersKarttikeya Mangalam et.al.2302.04869:mortar_board:Code
2023-02-09IH-ViT: Vision Transformer-based Integrated Circuit Appearance Defect DetectionXiaoibin Wang et.al.2302.04521:mortar_board:None
2023-02-08Adapting Pre-trained Vision Transformers from 2D to 3D through Weight Inflation Improves Medical Image SegmentationYuhui Zhang et.al.2302.04303:mortar_board:Code
2023-02-08Predicting Thrombectomy Recanalization from CT Imaging Using Deep Learning ModelsHaoyue Zhang et.al.2302.04143:mortar_board:None
2023-02-08Cross-Layer Retrospective Retrieving via Layer AttentionYanwen Fang et.al.2302.03985:mortar_board:Code
2023-02-08SwinCross: Cross-modal Swin Transformer for Head-and-Neck Tumor Segmentation in PET/CT ImagesGary Y. Li et.al.2302.03861:mortar_board:None
2023-02-07Understanding Why ViT Trains Badly on Small Datasets: An Intuitive PerspectiveHaoran Zhu et.al.2302.03751:mortar_board:Code
2023-02-07Deep Class-Incremental Learning: A SurveyDa-Wei Zhou et.al.2302.03648:mortar_board:Code
2023-02-06Spatial Functa: Scaling Functa to ImageNet Classification and GenerationMatthias Bauer et.al.2302.03130:mortar_board:None
2023-02-06AIM: Adapting Image Models for Efficient Video Action RecognitionTaojiannan Yang et.al.2302.03024:mortar_board:None
2023-02-06V1T: large-scale mouse V1 response prediction using a Vision TransformerBryan M. Li et.al.2302.03023:mortar_board:None
2023-02-04Oscillation-free Quantization for Low-bit Vision TransformersShih-Yang Liu et.al.2302.02210:mortar_board:None
2023-02-04Knowledge Distillation in Vision Transformers: A Critical ReviewGousia Habib et.al.2302.02108:mortar_board:None
2023-02-03DilateFormer: Multi-Scale Dilated Transformer for Visual RecognitionJiayu Jiao et.al.2302.01791:mortar_board:Code
2023-02-02Fast, Differentiable and Sparse Top-k: a Convex Analysis PerspectiveMichael E. Sander et.al.2302.01425:mortar_board:None
2023-02-02Dual PatchNormManoj Kumar et.al.2302.01327:mortar_board:None
2023-02-02Mnemosyne: Learning to Train Transformers with TransformersDeepali Jain et.al.2302.01128:mortar_board:None
2023-02-02LesionAid: Vision Transformers-based Skin Lesion Generation and ClassificationGhanta Sai Krishna et.al.2302.01104:mortar_board:None
2023-02-02Vision Transformer-based Feature Extraction for Generalized Zero-Shot LearningJiseob Kim et.al.2302.00875:mortar_board:None
2023-02-01Efficient Scopeformer: Towards Scalable and Rich Feature Extraction for Intracranial Hemorrhage DetectionYassine Barhoumi et.al.2302.00220:mortar_board:None
2023-01-31Real Estate Property Valuation using Self-Supervised Vision TransformersMahdieh Yazdani et.al.2302.00117:mortar_board:None
2023-01-31Fairness-aware Vision Transformer via Debiased Self-AttentionYao Qiang et.al.2301.13803:mortar_board:None
2023-01-31Inference Time Evidences of Adversarial Attacks for Forensic on TransformersHugo Lemarchant et.al.2301.13356:mortar_board:None
2023-01-30SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic SegmentationQiang Wan et.al.2301.13156:mortar_board:Code
2023-01-30DepGraph: Towards Any Structural PruningGongfan Fang et.al.2301.12900:mortar_board:Code
2023-01-29Graph Mixer NetworksAhmet Sarıgün et.al.2301.12493:mortar_board:Code
2023-01-29Towards Verifying the Geometric Robustness of Large-scale Neural NetworksFu Wang et.al.2301.12456:mortar_board:Code
2023-01-29PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision TransformerJiayu Shang et.al.2301.12422:mortar_board:Code
2023-01-29Towards Vision Transformer Unrolling Fixed-Point Algorithm: a Case Study on Image RestorationPeng Qiao et.al.2301.12332:mortar_board:None
2023-01-28Aerial Image Object Detection With Vision Transformer Detector (ViTDet)Liya Wang et.al.2301.12058:mortar_board:None
2023-01-27Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream TasksHaiyan Zhao et.al.2301.11560:mortar_board:None
2023-01-27Robust Transformer with Locality Inductive Bias and Feature NormalizationOmid Nejati Manzari et.al.2301.11553:mortar_board:None
2023-01-26Compact Transformer Tracker with Correlative Masked ModelingZikai Song et.al.2301.10938:mortar_board:Code
2023-01-26Facial Emotion RecognitionArpita Vats et.al.2301.10906:mortar_board:None
2023-01-25Out of Distribution Performance of State of Art Vision ModelMd Salman Rahman et.al.2301.10750:mortar_board:None
2023-01-25Connecting metrics for shape-texture knowledge in computer visionTiago Oliveira et.al.2301.10608:mortar_board:None
2023-01-24RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in Autonomous DrivingAngelika Ando et.al.2301.10222:mortar_board:None
2023-01-24Model soups to increase inference without increasing compute timeCharles Dansereau et.al.2301.10092:mortar_board:Code
2023-01-23Combined Use of Federated Learning and Image Encryption for Privacy-Preserving Image Classification with Vision TransformerTeru Nagamori et.al.2301.09255:mortar_board:None
2023-01-20Holistically Explainable Vision TransformersMoritz Böhle et.al.2301.08669:mortar_board:None
2023-01-20Image Memorability Prediction with Vision TransformersThomas Hagen et.al.2301.08647:mortar_board:None
2023-01-19Self-Supervised Learning from Images with a Joint-Embedding Predictive ArchitectureMahmoud Assran et.al.2301.08243:mortar_board:None
2023-01-18ViT-AE++: Improving Vision Transformer Autoencoder for Self-supervised Medical Image RepresentationsChinmay Prabhakar et.al.2301.07382:mortar_board:None
2023-01-17Long Range Pooling for 3D Large-Scale Scene UnderstandingXiang-Li Li et.al.2301.06962:mortar_board:None
2023-01-16Flow imaging as an alternative to pressure transducers through vision transformers and convolutional neural networksRenato F. Miotto et.al.2301.06410:mortar_board:None
2023-01-15TextileNet: A Material Taxonomy-based Fashion Textile DatasetShu Zhong et.al.2301.06160:mortar_board:Code
2023-01-13Efficient Activation Function Optimization through Surrogate ModelingGarrett Bingham et.al.2301.05785:mortar_board:Code
2023-01-13GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision TransformerMiao Yin et.al.2301.05345:mortar_board:None
2023-01-12ViTs for SITS: Vision Transformers for Satellite Image Time SeriesMichail Tarasiou et.al.2301.04944:mortar_board:None
2023-01-11Head-Free Lightweight Semantic Segmentation with Linear TransformerBo Dong et.al.2301.04648:mortar_board:Code
2023-01-11Dynamic Background Reconstruction via Transformer for Infrared Small Target DetectionJingchao Peng et.al.2301.04497:mortar_board:None
2023-01-11Deep Learning Model with Attention Mechanism for Super-resolution of Wireless Channel CharacteristicsHaoyang Zhang et.al.2301.04479:mortar_board:None
2023-01-10Vision Transformers Are Good Mask Auto-LabelersShiyi Lan et.al.2301.03992:mortar_board:None
2023-01-10Dynamic Grained Encoder for Vision TransformersLin Song et.al.2301.03831:mortar_board:Code
2023-01-09Advances in Medical Image Analysis with Vision Transformers: A Comprehensive ReviewReza Azad et.al.2301.03505:mortar_board:Code
2023-01-08STPrivacy: Spatio-Temporal Tubelet Sparsification and Anonymization for Privacy-preserving Action RecognitionMing Li et.al.2301.03046:mortar_board:None
2023-01-06Exploring Efficient Few-shot Adaptation for Vision TransformersChengming Xu et.al.2301.02419:mortar_board:Code
2023-01-05Skip-Attention: Improving Vision Transformers by Paying Less AttentionShashanka Venkataramanan et.al.2301.02240:mortar_board:None
2023-01-05MS-DINO: Efficient Distributed Training of Vision Transformer Foundation Model in Medical Domain through Masked SamplingSangjoon Park et.al.2301.02064:mortar_board:None
2023-01-05Enabling Augmented Segmentation and Registration in Ultrasound-Guided Spinal Surgery via Realistic Ultrasound Synthesis from Diagnostic CT VolumeAng Li et.al.2301.01940:mortar_board:None
2023-01-04Semi-MAE: Masked Autoencoders for Semi-supervised Vision TransformersHaojie Yu et.al.2301.01431:mortar_board:None
2023-01-03Explainability and Robustness of Deep Visual Classification ModelsJindong Gu et.al.2301.01343:mortar_board:None
2023-01-03TinyMIM: An Empirical Study of Distilling MIM Pre-trained ModelsSucheng Ren et.al.2301.01296:mortar_board:Code
2023-01-03A New Perspective to Boost Vision Transformer for Medical Image ClassificationYuexiang Li et.al.2301.00989:mortar_board:None
2023-01-03Detecting Severity of Diabetic Retinopathy from Fundus Images using Ensembled TransformersChandranath Adak et.al.2301.00973:mortar_board:None
2023-01-02Lightweight Image Inpainting by Stripe Window Transformer with Joint Attention to CNNTsung-Jung Liu et.al.2301.00553:mortar_board:Code
2023-01-01Goal-guided Transformer-enabled Reinforcement Learning for Efficient Autonomous NavigationWenhui Huang et.al.2301.00362:mortar_board:None
2022-12-29AttEntropy: Segmenting Unknown Objects in Complex Scenes using the Spatial Attention Entropy of Semantic Segmentation TransformersKrzysztof Lis et.al.2212.14397:mortar_board:None
2022-12-28RevealED: Uncovering Pro-Eating Disorder Content on Twitter Using Deep LearningJonathan Feldman et.al.2212.13949:mortar_board:None
2022-12-28Exploring Vision Transformers as Diffusion LearnersHe Cao et.al.2212.13771:mortar_board:None
2022-12-28OVO: One-shot Vision Transformer Search with Online distillationZimian Wei et.al.2212.13766:mortar_board:None
2022-12-28Representation Separation for Semantic Segmentation with Vision TransformersYuanduo Hong et.al.2212.13764:mortar_board:None
2022-12-27Semi-supervised multiscale dual-encoding method for faulty traffic data detectionYongcan Huang et.al.2212.13596:mortar_board:None
2022-12-26SMMix: Self-Motivated Image Mixing for Vision TransformersMengzhao Chen et.al.2212.12977:mortar_board:Code
2022-12-23A Close Look at Spatial Modeling: From Attention to ConvolutionXu Ma et.al.2212.12552:mortar_board:Code
2022-12-23PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic ImageWeichao Shen et.al.2212.12156:mortar_board:None
2022-12-21What Makes for Good Tokenizers in Vision Transformer?Shengju Qian et.al.2212.11115:mortar_board:None
2022-12-21Investigation of Network Architecture for Multimodal Head-and-Neck Tumor SegmentationYe Li et.al.2212.10724:mortar_board:None
2022-12-20Visual Transformers for Primates Classification and Covid DetectionSteffen Illium et.al.2212.10093:mortar_board:None
2022-12-20Conditioned Generative Transformers for Histopathology Image Synthetic AugmentationMeng Li et.al.2212.09977:mortar_board:None
2022-12-16Rethinking Cooking State Recognition with Vision TransformersAkib Mohammed Khan et.al.2212.08586:mortar_board:None
2022-12-16Morphological Classification of Radio Galaxies with wGAN-supported AugmentationLennart Rustige et.al.2212.08504:mortar_board:Code
2022-12-16RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision TransformersZhikai Li et.al.2212.08254:mortar_board:None
2022-12-15Rethinking Vision Transformers for MobileNet Size and SpeedYanyu Li et.al.2212.08059:mortar_board:Code
2022-12-15FlexiViT: One Model for All Patch SizesLucas Beyer et.al.2212.08013:mortar_board:Code
2022-12-15Vision Transformers are Parameter-Efficient Audio-Visual LearnersYan-Bo Lin et.al.2212.07983:mortar_board:Code
2022-12-15Full Contextual Attention for Multi-resolution Transformers in Semantic SegmentationLoic Themyr et.al.2212.07890:mortar_board:None
2022-12-15Detecting Bone Lesions in X-Ray Under Diverse Acquisition ConditionsTal Zimbalist et.al.2212.07792:mortar_board:None
2022-12-13GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group PropagationChenhongyi Yang et.al.2212.06795:mortar_board:Code
2022-12-13What do Vision Transformers Learn? A Visual ExplorationAmin Ghiasi et.al.2212.06727:mortar_board:Code
2022-12-13OAMixer: Object-aware Mixing Layer for Vision TransformersHyunwoo Kang et.al.2212.06595:mortar_board:Code
2022-12-12You Only Need a Good Embeddings Extractor to Fix Spurious CorrelationsRaghav Mehta et.al.2212.06254:mortar_board:None
2022-12-12Masked autoencoders are effective solution to transformer data-hungryJiawei Mao et.al.2212.05677:mortar_board:Code
2022-12-11Recurrent Vision Transformers for Object Detection with Event CamerasMathias Gehrig et.al.2212.05598:mortar_board:None
2022-12-11PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category DiscoverySheng Zhang et.al.2212.05590:mortar_board:Code
2022-12-11Vision Transformer with Attentive Pooling for Robust Facial Expression RecognitionFanglei Xue et.al.2212.05463:mortar_board:None
2022-12-10Position Embedding Needs an Independent Layer NormalizationRunyi Yu et.al.2212.05262:mortar_board:None
2022-12-09Sparse Upcycling: Training Mixture-of-Experts from Dense CheckpointsAran Komatsuzaki et.al.2212.05055:mortar_board:Code
2022-12-09AugNet: Dynamic Test-Time Augmentation via Differentiable FunctionsShohei Enomoto et.al.2212.04681:mortar_board:None
2022-12-09Mitigation of Spatial Nonstationarity with Vision TransformersLei Liu et.al.2212.04633:mortar_board:None
2022-12-07ViTPose+: Vision Transformer Foundation Model for Generic Body Pose EstimationYufei Xu et.al.2212.04246:mortar_board:Code
2022-12-08Group Generalized Mean Pooling for Vision TransformerByungsoo Ko et.al.2212.04114:mortar_board:None
2022-12-07Multimodal Vision Transformers with Forced Attention for Behavior AnalysisTanay Agrawal et.al.2212.03968:mortar_board:None
2022-12-07Teaching Matters: Investigating the Role of Supervision in Vision TransformersMatthew Walmer et.al.2212.03862:mortar_board:Code
2022-12-06Visual Query Tuning: Towards Effective Usage of Intermediate Representations for Parameter and Memory Efficient Transfer LearningCheng-Hao Tu et.al.2212.03220:mortar_board:None
2022-12-06FacT: Factor-Tuning for Lightweight Adaptation on Vision TransformerShibo Jie et.al.2212.03145:mortar_board:Code
2022-12-06Event-based Monocular Dense Depth Estimation with Recurrent TransformersXu Liu et.al.2212.02791:mortar_board:None
2022-12-06Semantic-aware Message Broadcasting for Efficient Unsupervised Domain AdaptationXin Li et.al.2212.02739:mortar_board:Code
2022-12-06Enabling and Accelerating Dynamic Vision Transformer Inference for Real-Time ApplicationsKavya Sreedhar et.al.2212.02687:mortar_board:None
2022-12-053D-LatentMapper: View Agnostic Single-View Reconstruction of 3D ShapesAlara Dirik et.al.2212.02184:mortar_board:None
2022-12-05Learning Imbalanced Data with Vision TransformersZhengzhuo Xu et.al.2212.02015:mortar_board:Code
2022-12-03Exploring Stochastic Autoregressive Image Modeling for Visual RepresentationYu Qi et.al.2212.01610:mortar_board:Code
2022-12-01ResFormer: Scaling ViTs with Multi-Resolution TrainingRui Tian et.al.2212.00776:mortar_board:None
2022-11-29Transformer-based Hand Gesture Recognition via High-Density EMG Signals: From Instantaneous Recognition to Fusion of Motor Unit Spike TrainsMansooreh Montazerin et.al.2212.00743:mortar_board:None
2022-11-30Part-based Face Recognition with Vision TransformersZhonglin Sun et.al.2212.00057:mortar_board:None
2022-11-29Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation TestingNataniel Ruiz et.al.2211.16499:mortar_board:None
2022-11-29RGB no more: Minimally-decoded JPEG Vision TransformersJeongsoo Park et.al.2211.16421:mortar_board:None
2022-11-29Lightweight Structure-Aware Attention for Visual UnderstandingHeeseung Kwon et.al.2211.16289:mortar_board:None
2022-11-29Metal-conscious Embedding for CBCT Projection InpaintingFuxin Fan et.al.2211.16219:mortar_board:None
2022-11-29NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision TransformersYijiang Liu et.al.2211.16056:mortar_board:None
2022-11-29LUMix: Improving Mixup by Better Modelling Label UncertaintyShuyang Sun et.al.2211.15846:mortar_board:None
2022-11-28Good helper is around you: Attention-driven Masked Image ModelingZhengqi Liu et.al.2211.15362:mortar_board:Code
2022-11-27Semantic-Aware Local-Global Vision TransformerJiatong Zhang et.al.2211.14705:mortar_board:None
2022-11-26Game Theoretic Mixed Experts for Combinational Adversarial Machine LearningEthan Rathbun et.al.2211.14669:mortar_board:None
2022-11-26Towards Better Input Masking for Convolutional Neural NetworksSriram Balasubramanian et.al.2211.14646:mortar_board:None
2022-11-26PatchGT: Transformer over Non-trainable Clusters for Learning Graph RepresentationsHan Gao et.al.2211.14425:mortar_board:Code
2022-11-25Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated OperationsTan Yu et.al.2211.14255:mortar_board:None
2022-11-25MPCViT: Searching for MPC-friendly Vision Transformer with Heterogeneous AttentionWenxuan Zeng et.al.2211.13955:mortar_board:None
2022-11-25Spatial-Temporal Attention Network for Open-Set Fine-Grained Image RecognitionJiayin Sun et.al.2211.13940:mortar_board:None
2022-11-25TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep Neural NetworksTaoyong Cui et.al.2211.13902:mortar_board:None
2022-11-25AFR-Net: Attention-Driven Fingerprint Recognition NetworkSteven A. Grosz et.al.2211.13897:mortar_board:None
2022-11-25Adaptive Attention Link-based Regularization for Vision TransformersHeegon Jin et.al.2211.13852:mortar_board:None
2022-11-24Efficient Zero-shot Visual Search via Target and Context-aware TransformerZhiwei Ding et.al.2211.13470:mortar_board:None
2022-11-23SVFormer: Semi-supervised Video Transformer for Action RecognitionZhen Xing et.al.2211.13222:mortar_board:Code
2022-11-23CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual LearningJames Seale Smith et.al.2211.13218:mortar_board:None
2022-11-23Indian Commercial Truck License Plate Detection and Recognition for Weighbridge AutomationSiddharth Agrawal et.al.2211.13194:mortar_board:None
2022-11-23ASiT: Audio Spectrogram vIsion Transformer for General Audio RepresentationSara Atito et.al.2211.13189:mortar_board:None
2022-11-23Data Augmentation Vision Transformer for Fine-grained Image ClassificationChao Hu et.al.2211.12879:mortar_board:None
2022-11-22Improving Robust Generalization by Direct PAC-Bayesian Bound MinimizationZifan Wang et.al.2211.12624:mortar_board:None
2022-11-22MagicPony: Learning Articulated 3D Animals in the WildShangzhe Wu et.al.2211.12497:mortar_board:None
2022-11-22TranViT: An Integrated Vision Transformer Framework for Discrete Transit Travel Time Range PredictionAwad Abdelhalim et.al.2211.12322:mortar_board:None
2022-11-22Generalizable Industrial Visual Anomaly Detection with Self-Induction Vision TransformerHaiming Yao et.al.2211.12311:mortar_board:None
2022-11-22Gated Class-Attention with Cascaded Feature Drift Compensation for Exemplar-free Continual Learning of Vision TransformersMarco Cotogni et.al.2211.12292:mortar_board:Code
2022-11-22Transformer Based Multi-Grained Features for Unsupervised Person Re-IdentificationJiachen Li et.al.2211.12280:mortar_board:None
2022-11-22Conv2Former: A Simple Transformer-Style ConvNet for Visual RecognitionQibin Hou et.al.2211.11943:mortar_board:Code
2022-11-21Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision TransformersSifan Long et.al.2211.11315:mortar_board:None
2022-11-21On the Robustness, Generalization, and Forgetting of Shape-Texture Debiased Continual LearningZenglin Shi et.al.2211.11174:mortar_board:None
2022-11-21Vision Transformer with Super Token SamplingHuaibo Huang et.al.2211.11167:mortar_board:None
2022-11-20Overfreezing Meets Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural NetworksYehuda Dar et.al.2211.11074:mortar_board:None
2022-11-20Hybrid Transformer Based Feature Fusion for Self-Supervised Monocular Depth EstimationSnehal Singh Tomar et.al.2211.11066:mortar_board:None
2022-11-19Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer TrainingZhenglun Kong et.al.2211.10801:mortar_board:None
2022-11-18Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer InferenceHaoran You et.al.2211.10526:mortar_board:None
2022-11-18Improved Cross-view Completion Pre-training for Stereo MatchingPhilippe Weinzaepfel et.al.2211.10408:mortar_board:None
2022-11-18Vision Transformers in Medical Imaging: A ReviewEmerald U. Henry et.al.2211.10043:mortar_board:None
2022-11-17EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual BackbonesYulin Wang et.al.2211.09703:mortar_board:Code
2022-11-17CPT-V: A Contrastive Approach to Post-Training Quantization of Vision TransformersNatalia Frumkin et.al.2211.09643:mortar_board:None
2022-11-17UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormerKunchang Li et.al.2211.09552:mortar_board:Code
2022-11-17Detecting Arbitrary Keypoints on Limbs and Skis with Sparse Partly Correct Segmentation MasksKatja Ludwig et.al.2211.09446:mortar_board:Code
2022-11-17How to Fine-Tune Vision Models with SGDAnanya Kumar et.al.2211.09359:mortar_board:None
2022-11-16Differentially Private Optimizers Can Learn Adversarially Robust ModelsYuan Zhang et.al.2211.08942:mortar_board:None
2022-11-13Demystify Self-Attention in Vision Transformers from a Semantic Perspective: Analysis and ApplicationLeijie Wu et.al.2211.08543:mortar_board:None
2022-11-15HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision TransformersPeiyan Dong et.al.2211.08110:mortar_board:None
2022-11-15ShadowDiffusion: Diffusion-based Shadow Removal using Classifier-driven Attention and Structure PreservationYeying Jin et.al.2211.08089:mortar_board:None
2022-11-15Using Human Perception to Regularize Transfer LearningJustin Dulay et.al.2211.07885:mortar_board:None
2022-11-14CabViT: Cross Attention among Blocks for Vision TransformerHaokui Zhang et.al.2211.07198:mortar_board:Code
2022-11-14Unsupervised Galaxy Morphological Visual Representation with Deep Contrastive LearningShoulin Wei et.al.2211.07168:mortar_board:Code
2022-11-14BiViT: Extremely Compressed Binary Vision TransformerYefei He et.al.2211.07091:mortar_board:None
2022-11-12MultiCrossViT: Multimodal Vision Transformer for Schizophrenia Prediction using Structural MRI and Functional Network Connectivity DataYuda Bi et.al.2211.06726:mortar_board:None
2022-11-12AU-Aware Vision Transformers for Biased Facial Expression RecognitionShuyi Mao et.al.2211.06609:mortar_board:None
2022-11-12End-to-End Machine Learning Framework for Facial AU Detection in Intensive Care UnitsSubhash Nerella et.al.2211.06570:mortar_board:None
2022-11-11A Comprehensive Survey of Transformers for Computer VisionSonain Jamil et.al.2211.06004:mortar_board:None
2022-11-10Demystify Transformers & Convolutions in Modern Image Deep NetworksJifeng Dai et.al.2211.05781:mortar_board:Code
2022-11-10InternImage: Exploring Large-Scale Vision Foundation Models with Deformable ConvolutionsWenhai Wang et.al.2211.05778:mortar_board:Code
2022-11-09Training a Vision Transformer from scratch in less than 24 hours with 1 GPUSaghar Irandoust et.al.2211.05187:mortar_board:None
2022-11-09ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor AttentionJyotikrishna Dass et.al.2211.05109:mortar_board:None
2022-11-09Pure Transformer with Integrated Experts for Scene Text RecognitionYew Lee Tan et.al.2211.04963:mortar_board:None
2022-11-09Masked Vision-Language Transformers for Scene Text RecognitionJie Wu et.al.2211.04785:mortar_board:Code
2022-11-08Splitting expands the application range of Vision Transformer – variable Vision Transformer (vViT)Takuma Usuzaki et.al.2211.03992:mortar_board:None
2022-11-07CoNMix for Source-free Single and Multi-target Domain AdaptationVikash Kumar et.al.2211.03876:mortar_board:None
2022-11-07Novel Muscle Monitoring by Radiomyography (RMG) and Application to Hand Gesture RecognitionZijing Zhang et.al.2211.03767:mortar_board:None
2022-11-07Group DETR v2: Strong Object Detector with Encoder-Decoder PretrainingQiang Chen et.al.2211.03594:mortar_board:None
2022-11-07Efficient Multi-order Gated Aggregation NetworkSiyuan Li et.al.2211.03295:mortar_board:Code
2022-11-06ViT-CX: Causal Explanation of Vision TransformersWeiyan Xie et.al.2211.03064:mortar_board:None
2022-11-04RCDPT: Radar-Camera fusion Dense Prediction TransformerChen-Chou Lo et.al.2211.02432:mortar_board:None
2022-11-04SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision TransformersA. Arezzo et.al.2211.02366:mortar_board:Code
2022-11-04Boosting Binary Neural Networks via Dynamic Thresholds LearningJiehua Zhang et.al.2211.02292:mortar_board:None
2022-11-03Rethinking Hierarchies in Pre-trained Plain Vision TransformerYufei Xu et.al.2211.01785:mortar_board:None
2022-11-03Evaluating a Synthetic Image Dataset Generated with Stable DiffusionAndreas Stöckl et.al.2211.01777:mortar_board:None
2022-11-02The Lottery Ticket Hypothesis for Vision TransformersXuan Shen et.al.2211.01484:mortar_board:None
2022-11-02Attention-based Neural Cellular AutomataMattie Tesfaldet et.al.2211.01233:mortar_board:None
2022-11-02RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the WildWeiyao Wang et.al.2211.01165:mortar_board:None
2022-11-02WITT: A Wireless Image Transmission Transformer for Semantic CommunicationsKe Yang et.al.2211.00937:mortar_board:Code
2022-11-01ViT-DeiT: An Ensemble Model for Breast Cancer Histopathological Images ClassificationAmira Alotaibi et.al.2211.00749:mortar_board:None
2022-10-31Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentationSimone Rossetti et.al.2210.17400:mortar_board:Code
2022-10-31ViT-LSLA: Vision Transformer with Light Self-Limited-AttentionZhenzhe Hechen et.al.2210.17115:mortar_board:None
2022-10-30ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial DiagnosisXu Cao et.al.2210.16943:mortar_board:Code
2022-10-30Foreign Object Debris Detection for Airport Pavement Images based on Self-supervised Localization and Vision TransformerTravis Munyer et.al.2210.16901:mortar_board:None
2022-10-30Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression PredictionYan Yang et.al.2210.16721:mortar_board:Code
2022-10-29ImplantFormer: Vision Transformer based Implant Position Regression Using Dental CBCT DataXinquan Yang et.al.2210.16467:mortar_board:None
2022-10-28Multimodal Transformer for Parallel Concatenated Variational AutoencodersStephen D. Liang et.al.2210.16174:mortar_board:None
2022-10-28Federated Learning for Chronic Obstructive Pulmonary Disease Classification with Partial Personalized Attention MechanismYiqing Shen et.al.2210.16142:mortar_board:None
2022-10-28Differentially Private CutMix for Split Learning with Vision TransformerSeungeun Oh et.al.2210.15986:mortar_board:None
2022-10-28Grafting Vision TransformersJongwoo Park et.al.2210.15943:mortar_board:None
2022-10-27Fully-attentive and interpretable: vision and video vision transformers for pain detectionGiacomo Fiorentini et.al.2210.15769:mortar_board:Code
2022-10-27PatchRot: A Self-Supervised Technique for Training Vision TransformersSachin Chhabra et.al.2210.15722:mortar_board:Code
2022-10-27Masked Transformer for image Anomaly LocalizationAxel De Nardin et.al.2210.15540:mortar_board:None
2022-10-27Li3DeTr: A LiDAR based 3D Detection TransformerGopi Krishna Erabati et.al.2210.15365:mortar_board:None
2022-10-27Vision Transformer for Adaptive Image Transmission over MIMO ChannelsHaotian Wu et.al.2210.15347:mortar_board:None
2022-10-27MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer for Autonomous DrivingGopi Krishna Erabati et.al.2210.15316:mortar_board:None
2022-10-27Spatio-Temporal Hybrid Fusion of CAE and SWIn Transformers for Lung Cancer Malignancy PredictionSadaf Khademi et.al.2210.15297:mortar_board:None
2022-10-27ViT-CAT: Parallel Vision Transformers with Cross Attention Fusion for Popularity Prediction in MEC NetworksZohreh HajiAkhondi-Meybodi et.al.2210.15125:mortar_board:None
2022-10-27Masked Vision-Language Transformer in FashionGe-Peng Ji et.al.2210.15110:mortar_board:Code
2022-10-26M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-designHanxue Liang et.al.2210.14793:mortar_board:Code
2022-10-26TPFNet: A Novel Text In-painting Transformer for Text RemovalOnkar Susladkar et.al.2210.14461:mortar_board:Code
2022-10-25Explicitly Increasing Input Information Density for Vision Transformers on Small DatasetsXiangyu Chen et.al.2210.14319:mortar_board:None
2022-10-25Learning Explicit Object-Centric Representations with Vision TransformersOscar Vikström et.al.2210.14139:mortar_board:None
2022-10-25Minutiae-Guided Fingerprint Embeddings via Vision TransformersSteven A. Grosz et.al.2210.13994:mortar_board:None
2022-10-24The Robustness Limits of SoTA Vision Models to Natural VariationMark Ibrahim et.al.2210.13604:mortar_board:None
2022-10-23Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present and FutureGuo-Jun Qi et.al.2210.13463:mortar_board:None
2022-10-23Delving into Masked Autoencoders for Multi-Label Thorax Disease ClassificationJunfei Xiao et.al.2210.12843:mortar_board:Code
2022-10-23UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery DetectionWanyi Zhuang et.al.2210.12752:mortar_board:None
2022-10-23Accelerated Linearized Laplace Approximation for Bayesian Deep LearningZhijie Deng et.al.2210.12642:mortar_board:Code
2022-10-22S2WAT: Image Style Transfer via Hierarchical Vision Transformer using Strips Window AttentionChiyu Zhang et.al.2210.12381:mortar_board:None
2022-10-22Accumulated Trivial Attention Matters in Vision Transformers on Small DatasetsXiangyu Chen et.al.2210.12333:mortar_board:Code
2022-10-21High-Fidelity Visual Structural Inspections through Transformers and Learnable ResizersKareem Eltouny et.al.2210.12175:mortar_board:None
2022-10-21Face Pyramid Vision TransformerKhawar Islam et.al.2210.11974:mortar_board:Code
2022-10-21Boosting vision transformers for image retrievalChull Hwan Song et.al.2210.11909:mortar_board:Code
2022-10-20GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration NetworkJheng-Wei Su et.al.2210.11419:mortar_board:None
2022-10-20General Image Descriptors for Open World Image Retrieval using ViT CLIPMarcos V. Conde et.al.2210.11141:mortar_board:Code
2022-10-20SimpleClick: Interactive Image Segmentation with Simple Vision TransformersQin Liu et.al.2210.11006:mortar_board:Code
2022-10-19A Unified View of Masked Image ModelingZhiliang Peng et.al.2210.10615:mortar_board:None
2022-10-19Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image RetrievalAbhra Chaudhuri et.al.2210.10486:mortar_board:None
2022-10-19Multi-view Gait Recognition based on Siamese Vision TransformerYanchen Yang et.al.2210.10421:mortar_board:None
2022-10-18Number-Adaptive Prototype Learning for 3D Point Cloud Semantic SegmentationYangheng Zhao et.al.2210.09948:mortar_board:None
2022-10-18Sequence and Circle: Exploring the Relationship Between PatchesZhengyang Yu et.al.2210.09871:mortar_board:None
2022-10-18Decoupling Features in Hierarchical Propagation for Video Object SegmentationZongxin Yang et.al.2210.09782:mortar_board:Code
2022-10-18ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-DesignHaoran You et.al.2210.09573:mortar_board:None
2022-10-18Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image GenerationRuijun Li et.al.2210.09549:mortar_board:None
2022-10-14oViT: An Accurate Second-Order Pruning Framework for Vision TransformersDenis Kuznedelev et.al.2210.09223:mortar_board:None
2022-10-17Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak LabelsAhmet Gokberk Gul et.al.2210.09021:mortar_board:None
2022-10-16Learning Self-Regularized Adversarial Views for Self-Supervised Vision TransformersTao Tang et.al.2210.08458:mortar_board:Code
2022-10-16Scratching Visual Transformer’s Back with Uniform AttentionNam Hyeon-Woo et.al.2210.08457:mortar_board:None
2022-10-15Transformer-based dimensionality reductionRuisheng Ran et.al.2210.08288:mortar_board:None
2022-10-15Distributionally Robust Multiclass Classification and Applications in Deep Image ClassifiersRuidi Chen et.al.2210.08198:mortar_board:None
2022-10-15Linear Video Transformer with Feature FixationKaiyue Lu et.al.2210.08164:mortar_board:None
2022-10-14Optimizing Vision Transformers for Medical Image Segmentation and Few-Shot Domain AdaptationQianying Liu et.al.2210.08066:mortar_board:None
2022-10-14Vision Transformer Visualization: What Neurons Tell and How Neurons Behave?Van-Anh Nguyen et.al.2210.07646:mortar_board:Code
2022-10-14When Adversarial Training Meets Vision Transformers: Recipes from Training to ArchitectureYichuan Mo et.al.2210.07540:mortar_board:Code
2022-10-13How to Train Vision Transformer on Small-scale Datasets?Hanan Gani et.al.2210.07240:mortar_board:Code
2022-10-13Feature-Proxy Transformer for Few-Shot SegmentationJian-Wei Zhang et.al.2210.06908:mortar_board:Code
2022-10-13Q-ViT: Accurate and Fully Quantized Low-bit Vision TransformerYanjing Li et.al.2210.06707:mortar_board:Code
2022-10-12S4ND: Modeling Images and Videos as Multidimensional Signals Using State SpacesEric Nguyen et.al.2210.06583:mortar_board:None
2022-10-12Prompt Generation Networks for Efficient Adaptation of Frozen Vision TransformersJochem Loedeman et.al.2210.06466:mortar_board:Code
2022-10-12Token-Label Alignment for Vision TransformersHan Xiao et.al.2210.06455:mortar_board:Code
2022-10-12Foundation TransformersHongyu Wang et.al.2210.06423:mortar_board:None
2022-10-12Distilling Knowledge from Language Models for Video-based Action AnticipationSayontan Ghosh et.al.2210.05991:mortar_board:None
2022-10-12GGViT: Multistream Vision Transformer Network in Face2Face Facial Reenactment DetectionHaotian Wu et.al.2210.05990:mortar_board:None
2022-10-12Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small DatasetsZhiying Lu et.al.2210.05958:mortar_board:Code
2022-10-12Towards Theoretically Inspired Neural Initialization OptimizationYibo Yang et.al.2210.05956:mortar_board:Code
2022-10-12Dynamic Clustering Network for Unsupervised Semantic SegmentationKehan Li et.al.2210.05944:mortar_board:None
2022-10-12SegViT: Semantic Segmentation with Plain Vision TransformersBowen Zhang et.al.2210.05844:mortar_board:None
2022-10-11SaiT: Sparse Vision Transformers through Adaptive Token PruningLing Li et.al.2210.05832:mortar_board:None
2022-10-11OPERA: Omni-Supervised Representation Learning with Hierarchical SupervisionsChengkun Wang et.al.2210.05557:mortar_board:Code
2022-10-11What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundariesStanislav Fort et.al.2210.05546:mortar_board:Code
2022-10-11UGformer for Robust Left Atrium and Scar Segmentation Across ScannersTianyi Liu et.al.2210.05151:mortar_board:None
2022-10-10Revisiting adapters with adversarial trainingSylvestre-Alvise Rebuffi et.al.2210.04886:mortar_board:None
2022-10-10Visual Prompt Tuning for Test-time Domain AdaptationYunhe Gao et.al.2210.04831:mortar_board:None
2022-10-09Students taught by multimodal teachers are superior action recognizersGorjan Radevski et.al.2210.04331:mortar_board:None
2022-10-09Strong Gravitational Lensing Parameter Estimation with Vision TransformerKuan-Wei Huang et.al.2210.04143:mortar_board:Code
2022-10-08Fast-ParC: Position Aware Global Kernel for ConvNets and ViTsTao Yang et.al.2210.04020:mortar_board:None
2022-10-07Game-Theoretic Understanding of MisclassificationKosuke Sumiyasu et.al.2210.03349:mortar_board:None
2022-10-07Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision TasksYen-Cheng Liu et.al.2210.03265:mortar_board:None
2022-10-06Gastrointestinal Disorder Detection with a Transformer Based ApproachA. K. M. Salman Hosain et.al.2210.03168:mortar_board:None
2022-10-06Real-World Robot Learning with Masked Visual Pre-trainingIlija Radosavovic et.al.2210.03109:mortar_board:None
2022-10-06Structure Representation Network and Uncertainty Feedback Learning for Dense Non-Uniform Fog RemovalYeying Jin et.al.2210.03061:mortar_board:Code
2022-10-06SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic DataChing-Yun Ko et.al.2210.02989:mortar_board:None
2022-10-06The Lie Derivative for Measuring Learned EquivarianceNate Gruver et.al.2210.02984:mortar_board:Code
2022-10-06Vision Transformer Based Model for Describing a Set of Images as a StoryZainy M. Malakan et.al.2210.02762:mortar_board:None
2022-10-05Centralized Feature Pyramid for Object DetectionYu Quan et.al.2210.02093:mortar_board:Code
2022-10-05Exploring The Role of Mean Teachers in Self-supervised Masked Auto-EncodersYoungwan Lee et.al.2210.02077:mortar_board:None
2022-10-04Multi-view Human Body Mesh TranslatorXiangjian Jiang et.al.2210.01886:mortar_board:None
2022-10-04Towards Flexible Inductive Bias via Progressive Reparameterization SchedulingYunsung Lee et.al.2210.01370:mortar_board:None
2022-10-03Introducing Vision Transformer for Alzheimer’s Disease classification task with 3D inputZilun Zhang et.al.2210.01177:mortar_board:None
2022-10-03Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuningWeicong Liang et.al.2210.01035:mortar_board:None
2022-10-03Visual Prompt Tuning for Generative Transfer LearningKihyuk Sohn et.al.2210.00990:mortar_board:None
2022-10-03Attention Distillation: self-supervised vision transformer students need more guidanceKai Wang et.al.2210.00944:mortar_board:None
2022-10-03A Strong Transfer Baseline for RGB-D Fusion in Vision TransformersGeorgios Tziafas et.al.2210.00843:mortar_board:None
2022-10-02Deep-OCTA: Ensemble Deep Learning Approaches for Diabetic Retinopathy Analysis on OCTA ImagesJunlin Hou et.al.2210.00515:mortar_board:Code
2022-10-01CAST: Concurrent Recognition and Segmentation with Adaptive Segment TokensTsung-Wei Ke et.al.2210.00314:mortar_board:None
2022-10-01EAPruning: Evolutionary Pruning for Vision Transformers and CNNsQingyuan Li et.al.2210.00181:mortar_board:None
2022-09-30Impact of Face Image Quality Estimation on Presentation Attack DetectionCarlos Aravena et.al.2209.15489:mortar_board:None
2022-09-30Diffusion-based Image Translation using Disentangled Style and Content RepresentationGihyun Kwon et.al.2209.15264:mortar_board:Code
2022-09-30Dual Progressive Transformations for Weakly Supervised Semantic SegmentationDongjian Huo et.al.2209.15211:mortar_board:Code
2022-09-30MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input FeaturesShakti N. Wadekar et.al.2209.15159:mortar_board:Code
2022-09-293D UX-Net: A Large Kernel Volumetric ConvNet Modernizing Hierarchical Transformer for Medical Image SegmentationHo Hin Lee et.al.2209.15076:mortar_board:Code
2022-09-29Effective Vision Transformer Training: A Data-Centric PerspectiveBenjia Zhou et.al.2209.15006:mortar_board:None
2022-09-29Dilated Neighborhood Attention TransformerAli Hassani et.al.2209.15001:mortar_board:Code
2022-09-28UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical SegmentationXin Yu et.al.2209.14378:mortar_board:Code
2022-09-28360FusionNeRF: Panoramic Neural Radiance Fields with Joint GuidanceShreyas Kulkarni et.al.2209.14265:mortar_board:Code
2022-09-28Exploring the Relationship between Architecture and Adversarially Robust GeneralizationShiyu Tang et.al.2209.14105:mortar_board:None
2022-09-28Motion Transformer for Unsupervised Image AnimationJiale Tao et.al.2209.14024:mortar_board:Code
2022-09-28DeViT: Deformed Vision Transformers in Video InpaintingJiayin Cai et.al.2209.13925:mortar_board:None
2022-09-28Adaptive Sparse ViT: Towards Learnable Adaptive Token Pruning by Fully Exploiting Self-AttentionXiangcheng Liu et.al.2209.13802:mortar_board:None
2022-09-28Attacking Compressed Vision TransformersSwapnil Parekh et.al.2209.13785:mortar_board:None
2022-09-28MTU-Net: Multi-level TransUNet for Space-based Infrared Tiny Ship DetectionTianhao Wu et.al.2209.13756:mortar_board:Code
2022-09-27FG-UAP: Feature-Gathering Universal Adversarial PerturbationZhixing Ye et.al.2209.13113:mortar_board:None
2022-09-26Generalized Parametric Contrastive LearningJiequan Cui et.al.2209.12400:mortar_board:Code
2022-09-25All are Worth Words: a ViT Backbone for Score-based Diffusion ModelsFan Bao et.al.2209.12152:mortar_board:None
2022-09-23Wide-Area Geolocalization with a Limited Field of View CameraLena M. Downes et.al.2209.11854:mortar_board:None
2022-09-23NasHD: Efficient ViT Architecture Performance Ranking using Hyperdimensional ComputingDongning Ma et.al.2209.11356:mortar_board:None
2022-09-22Colonoscopy Landmark Detection using Vision TransformersAniruddha Tamhane et.al.2209.11304:mortar_board:None
2022-09-20Traffic Accident Risk Forecasting using Contextual Vision TransformersKhaled Saleh et.al.2209.11180:mortar_board:None
2022-09-22Pretraining the Vision Transformer using self-supervised methods for vision based Deep Reinforcement LearningManuel Goulão et.al.2209.10901:mortar_board:Code
2022-09-21PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress ClassificationWenhao Tang et.al.2209.10074:mortar_board:None
2022-09-19Multi-Task Vision Transformer for Semi-Supervised Driver Distraction DetectionYunsheng Ma et.al.2209.09178:mortar_board:Code
2022-09-19Panoramic Vision Transformer for Saliency Detection in 360° VideosHeeseung Yun et.al.2209.08956:mortar_board:None
2022-09-19Estimating Brain Age with Global and Local DependenciesYanwu Yang et.al.2209.08933:mortar_board:None
2022-09-19HiMFR: A Hybrid Masked Face Recognition Through Face InpaintingMd Imran Hosen et.al.2209.08930:mortar_board:Code
2022-09-19Attentive Symmetric Autoencoder for Brain MRI SegmentationJunjia Huang et.al.2209.08887:mortar_board:Code
2022-09-19Axially Expanded Windows for Local-Global Interaction in Vision TransformersZhemin Zhang et.al.2209.08726:mortar_board:None
2022-09-19Uncertainty Aware Multitask Pyramid Vision Transformer For UAV-Based Object Re-IdentificationSyeda Nyma Ferdous et.al.2209.08686:mortar_board:None
2022-09-16PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimationHaoyu Ma et.al.2209.08194:mortar_board:Code
2022-09-16Quantum Vision TransformersEl Amine Cherrat et.al.2209.08167:mortar_board:None
2022-09-16Self-Supervised Learning of Phenotypic Representations from Cell Images with Weak LabelsJan Oscar Cross-Zamirski et.al.2209.07819:mortar_board:None
2022-09-16ConvFormer: Closing the Gap Between CNN and Vision TransformersZimian Wei et.al.2209.07738:mortar_board:None
2022-09-16A Mosquito is Worth 16x16 Larvae: Evaluation of Deep Learning Architectures for Mosquito Larvae ClassificationAswin Surya et.al.2209.07718:mortar_board:Code
2022-09-16Hybrid Window Attention Based Transformer Architecture for Brain Tumor SegmentationHimashi Peiris et.al.2209.07704:mortar_board:Code
2022-09-15Medical Image Segmentation using LeViT-UNet++: A Case Study on GI Tract DataPraneeth Nemani et.al.2209.07515:mortar_board:None
2022-09-15Hydra Attention: Efficient Attention with Many HeadsDaniel Bolya et.al.2209.07484:mortar_board:None
2022-09-15On the Surprising Effectiveness of Transformers in Low-Labeled Video RecognitionFarrukh Rahman et.al.2209.07474:mortar_board:None
2022-09-15A Light Recipe to Train Robust Vision TransformersEdoardo Debenedetti et.al.2209.07399:mortar_board:Code
2022-09-15Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?Yi Wang et.al.2209.07026:mortar_board:Code
2022-09-15PriorLane: A Prior Knowledge Enhanced Lane Detection Approach Based on TransformerQibo Qiu et.al.2209.06994:mortar_board:Code
2022-09-14On the interplay of adversarial robustness and architecture components: patches, convolution and attentionFrancesco Croce et.al.2209.06953:mortar_board:None
2022-09-14PaLI: A Jointly-Scaled Multilingual Language-Image ModelXi Chen et.al.2209.06794:mortar_board:None
2022-09-14Transformers and CNNs both Beat Humans on SBIROmar Seddati et.al.2209.06629:mortar_board:None
2022-09-13DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with TransformerDafeng Zhang et.al.2209.06040:mortar_board:None
2022-09-13A lightweight Transformer-based model for fish landmark detectionAlzayat Saleh et.al.2209.05777:mortar_board:None
2022-09-13Vision Transformers for Action Recognition: A SurveyAnwaar Ulhaq et.al.2209.05700:mortar_board:None
2022-09-13PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision TransformersZhikai Li et.al.2209.05687:mortar_board:Code
2022-09-13ComplETR: Reducing the cost of annotations for object detection in dense scenes with vision transformersAchin Jain et.al.2209.05654:mortar_board:None
2022-09-07Transfer Learning and Vision Transformer based State-of-Health prediction of Lithium-Ion BatteriesPengyu Fu et.al.2209.05253:mortar_board:None
2022-09-12Vision Transformer with Convolutional Encoder-Decoder for Hand Gesture Recognition using 24 GHz Doppler RadarKavinda Kehelella et.al.2209.05032:mortar_board:None
2022-09-09EchoCoTr: Estimation of the Left Ventricular Ejection Fraction from Spatiotemporal EchocardiographyRand Muhtaseb et.al.2209.04242:mortar_board:None
2022-09-07Prior Knowledge-Guided Attention in Self-Supervised Vision TransformersKevin Miao et.al.2209.03745:mortar_board:None
2022-09-08Multi-Granularity Prediction for Scene Text RecognitionPeng Wang et.al.2209.03592:mortar_board:None
2022-09-08Video Vision Transformers for Violence DetectionSanskar Singh et.al.2209.03561:mortar_board:None
2022-09-07Securing the Spike: On the Transferabilty and Security of Spiking Neural Networks to Adversarial ExamplesNuo Xu et.al.2209.03358:mortar_board:None
2022-09-06Fusion of Satellite Images and Weather Data with Transformer Networks for Downy Mildew Disease DetectionWilliam Maillet et.al.2209.02797:mortar_board:None
2022-09-06ViTKD: Practical Guidelines for ViT feature knowledge distillationZhendong Yang et.al.2209.02432:mortar_board:Code
2022-09-06Transformer-CNN Cohort: Semi-supervised Semantic Segmentation by the Best of Both StudentsXu Zheng et.al.2209.02178:mortar_board:None
2022-09-04Time-distance vision transformers in lung cancer diagnosis from longitudinal computed tomographyThomas Z. Li et.al.2209.01676:mortar_board:Code
2022-08-31MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual RecognitionYunhao Wang et.al.2209.01620:mortar_board:None
2022-09-03Vision Transformers and YoloV5 based Driver Drowsiness Detection FrameworkGhanta Sai Krishna et.al.2209.01401:mortar_board:None
2022-09-02Transformers in Remote Sensing: A SurveyAbdulaziz Amer Aleissaee et.al.2209.01206:mortar_board:None
2022-08-31EViT: Privacy-Preserving Image Retrieval via Encrypted Vision Transformer in Cloud ComputingQihua Feng et.al.2208.14657:mortar_board:Code
2022-08-31SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual CategorizationHongbo Sun et.al.2208.14607:mortar_board:Code
2022-08-29Open-Set Semi-Supervised Object DetectionYen-Cheng Liu et.al.2208.13722:mortar_board:None
2022-08-28An Unsupervised Learning-based Framework for Effective Representation Extraction of Reactor AccidentsChengyuan Li et.al.2208.13147:mortar_board:None
2022-08-28ClusTR: Exploring Efficient Self-attention via Clustering for Vision TransformersYutong Xie et.al.2208.13138:mortar_board:None
2022-08-28An Access Control Method with Secret Key for Semantic Segmentation ModelsTeru Nagamori et.al.2208.13135:mortar_board:None
2022-08-27TrojViT: Trojan Insertion in Vision TransformersMengxin Zheng et.al.2208.13049:mortar_board:None
2022-08-26VMFormer: End-to-End Video Matting with TransformerJiachen Li et.al.2208.12801:mortar_board:None
2022-08-24On a Built-in Conflict between Deep Learning and Systematic GeneralizationYuanpeng Li et.al.2208.11633:mortar_board:Code
2022-08-20An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with DiacriticsAly Mostafa et.al.2208.11484:mortar_board:None
2022-08-24A Deep Learning Approach Using Masked Image Modeling for Reconstruction of Undersampled K-spacesKyler Larsen et.al.2208.11472:mortar_board:Code
2022-08-24Federated Self-Supervised Contrastive Learning and Masked Autoencoder for Dermatological Disease DiagnosisYawen Wu et.al.2208.11278:mortar_board:None
2022-08-23FocusFormer: Focusing on What We Need via Architecture SamplerJing Liu et.al.2208.10861:mortar_board:None
2022-08-22Predicting microsatellite instability and key biomarkers in colorectal cancer from H&E-stained images: Achieving SOTA with Less Data using Swin TransformerBangwei Guo et.al.2208.10495:mortar_board:None
2022-08-22ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image RecognitionMengqi Xue et.al.2208.10431:mortar_board:Code
2022-08-20Analyzing Adversarial Robustness of Vision Transformers against Spatial and Spectral AttacksGihyun Kim et.al.2208.09602:mortar_board:None
2022-08-19A Dual Modality Approach For (Zero-Shot) Multi-Label ClassificationShichao Xu et.al.2208.09562:mortar_board:None
2022-08-19Accelerating Vision Transformer Training via a Patch Sampling ScheduleBradley McDanel et.al.2208.09520:mortar_board:Code
2022-08-18The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTsChris Rockwell et.al.2208.08988:mortar_board:None
2022-08-18Prompt Vision Transformer for Domain GeneralizationZangwei Zheng et.al.2208.08914:mortar_board:None
2022-08-17Conviformers: Convolutionally guided Vision TransformerMohit Vaishnav et.al.2208.08900:mortar_board:None
2022-08-17Video-TransUNet: Temporally Blended Vision Transformer for CT VFSS Instance SegmentationChengxi Zeng et.al.2208.08315:mortar_board:Code
2022-08-17Transformer Vs. MLP-Mixer Exponential Expressive Gap For NLP ProblemsDan Navon et.al.2208.08191:mortar_board:None
2022-08-17Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest RadiographsFinn Behrendt et.al.2208.08166:mortar_board:None
2022-08-16ViT-ReT: Vision and Recurrent Transformer Neural Networks for Human Activity Recognition in VideosJames Wensel et.al.2208.07929:mortar_board:None
2022-08-16Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion ModelXiulong Yang et.al.2208.07791:mortar_board:None
2022-08-10PatchDropout: Economizing Vision Transformers Using Patch DropoutYue Liu et.al.2208.07220:mortar_board:None
2022-08-15A Vision Transformer-Based Approach to Bearing Fault Classification via Vibration SignalsAbid Hasan Zim et.al.2208.07070:mortar_board:None
2022-08-15Self-Supervised Vision Transformers for Malware DetectionSachith Seneviratne et.al.2208.07049:mortar_board:Code
2022-08-14Shuffle Instances-based Vision Transformer for Pancreatic Cancer ROSE Image ClassificationTianyi Zhang et.al.2208.06833:mortar_board:Code
2022-08-13Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep ModelsXingyu Xie et.al.2208.06677:mortar_board:None
2022-08-12When CNN Meet with ViT: Towards Semi-Supervised Learning for Multi-Class Medical Image Semantic SegmentationZiyang Wang et.al.2208.06449:mortar_board:Code
2022-08-12BEiT v2: Masked Image Modeling with Vector-Quantized Visual TokenizersZhiliang Peng et.al.2208.06366:mortar_board:None
2022-08-11Shifted Windows Transformers for Medical Image Quality AssessmentCaner Ozer et.al.2208.06034:mortar_board:None
2022-08-11Semi-supervised Vision Transformers at ScaleZhaowei Cai et.al.2208.05688:mortar_board:None
2022-08-10Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme QuantizationZhengang Li et.al.2208.05163:mortar_board:None
2022-08-10Ghost-free High Dynamic Range Imaging with Context-aware TransformerZhen Liu et.al.2208.05114:mortar_board:Code
2022-08-09CoViT: Real-time phylogenetics for the SARS-CoV-2 pandemic using Vision TransformersZuher Jahshan et.al.2208.05004:mortar_board:Code
2022-08-07U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration?Xi Jia et.al.2208.04939:mortar_board:None
2022-08-09How Well Do Vision Transformers (VTs) Transfer To The Non-Natural Image Domain? An Empirical Study Involving Art ClassificationVincent Tonkes et.al.2208.04693:mortar_board:Code
2022-08-08Occlusion-Aware Instance Segmentation via BiLayer Network ArchitecturesLei Ke et.al.2208.04438:mortar_board:Code
2022-08-083D Vision with Transformers: A SurveyJean Lahoud et.al.2208.04309:mortar_board:Code
2022-08-08Understanding Masked Image Modeling via Learning Occlusion Invariant FeatureXiangwen Kong et.al.2208.04164:mortar_board:None
2022-08-08Efficient Neural Net Approaches in Metal Casting Defect DetectionRohit Lal et.al.2208.04150:mortar_board:None
2022-08-08Advancing Plain Vision Transformer Towards Remote Sensing Foundation ModelDi Wang et.al.2208.03987:mortar_board:Code
2022-08-06MonoViT: Self-Supervised Monocular Depth Estimation with a Vision TransformerChaoqiang Zhao et.al.2208.03543:mortar_board:Code
2022-08-06Analysing the Memorability of a Procedural Crime-Drama TV Series, CSISean Cummins et.al.2208.03479:mortar_board:None
2022-08-04Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image ClassificationFaris Almalik et.al.2208.02851:mortar_board:Code
2022-08-04DropKeyBonan Li et.al.2208.02646:mortar_board:None
2022-08-04MVSFormer: Multi-View Stereo with Pre-trained Vision Transformers and Temperature-based DepthChenjie Cao et.al.2208.02541:mortar_board:None
2022-08-03GPPF: A General Perception Pre-training Framework via Sparsely Activated Multi-Task LearningBenyuan Sun et.al.2208.02148:mortar_board:None
2022-08-03SSformer: A Lightweight Transformer for Semantic SegmentationWentao Shi et.al.2208.02034:mortar_board:Code
2022-08-03Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 DiagnosisXiao Qi et.al.2208.01843:mortar_board:Code
2022-08-03Learning Prior Feature and Attention Enhanced Image InpaintingChenjie Cao et.al.2208.01837:mortar_board:Code
2022-08-02Two-Stream Transformer Architecture for Long Video UnderstandingEdward Fish et.al.2208.01753:mortar_board:None
2022-08-02A Novel Transformer Network with Shifted Window Cross-Attention for Spatiotemporal Weather ForecastingAlabi Bojesomo et.al.2208.01252:mortar_board:None
2022-08-01Understanding Adversarial Robustness of Vision Transformers via Cauchy ProblemZheng Wang et.al.2208.00906:mortar_board:Code
2022-07-25D$^3$Former: Debiased Dual Distilled Transformer for Incremental LearningAbdelrahman Mohamed et.al.2208.00777:mortar_board:None
2022-08-01TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical Image SegmentationReza Azad et.al.2208.00713:mortar_board:Code
2022-07-29Restoring Vision in Adverse Weather Conditions with Patch-Based Denoising Diffusion ModelsOzan Özdenizci et.al.2207.14626:mortar_board:Code
2022-07-29ScaleFormer: Revisiting the Transformer-based Backbones from a Scale-wise Perspective for Medical Image SegmentationHuimin Huang et.al.2207.14552:mortar_board:None
2022-07-28HorNet: Efficient High-Order Spatial Interactions with Recursive Gated ConvolutionsYongming Rao et.al.2207.14284:mortar_board:Code
2022-07-28DnSwin: Toward Real-World Denoising via Continuous Wavelet Sliding-TransformerHao Li et.al.2207.13861:mortar_board:None
2022-07-24Online Continual Learning with Contrastive Vision TransformerZhen Wang et.al.2207.13516:mortar_board:None
2022-07-27Deep Clustering with Features from Self-Supervised PretrainingXingzhi Zhou et.al.2207.13364:mortar_board:None
2022-07-27Convolutional Embedding Makes Hierarchical Vision Transformer StrongerCong Wang et.al.2207.13317:mortar_board:None
2022-07-25Self-Distilled Vision Transformer for Domain GeneralizationMaryam Sultana et.al.2207.12392:mortar_board:Code
2022-07-22Applying Spatiotemporal Attention to Identify Distracted and Drowsy Driving with Vision TransformersSamay Lakhani et.al.2207.12148:mortar_board:None
2022-07-25Jigsaw-ViT: Learning Jigsaw Puzzles in Vision TransformerYingyi Chen et.al.2207.11971:mortar_board:None
2022-07-25Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic SegmentationJiaming Zhang et.al.2207.11860:mortar_board:Code
2022-07-24Improved Super Resolution of MR Images Using CNNs and Vision TransformersDwarikanath Mahapatra et.al.2207.11748:mortar_board:None
2022-07-24Affective Behaviour Analysis Using Pretrained Model with Facial PrioriYifan Li et.al.2207.11679:mortar_board:None
2022-07-24MAR: Masked Autoencoders for Efficient Action RecognitionZhiwu Qing et.al.2207.11660:mortar_board:None
2022-07-22Facial Expression Recognition using Vanilla ViT backbones with MAE PretrainingJia Li et.al.2207.11081:mortar_board:None
2022-07-21Focused Decoding Enables 3D Anatomical Detection by TransformersBastian Wittmann et.al.2207.10774:mortar_board:Code
2022-07-21TinyViT: Fast Pretraining Distillation for Small Vision TransformersKan Wu et.al.2207.10666:mortar_board:Code
2022-07-21Towards Efficient Adversarial Training on Vision TransformersBoxi Wu et.al.2207.10498:mortar_board:None
2022-07-21An Efficient Spatio-Temporal Pyramid Transformer for Action DetectionYuetian Weng et.al.2207.10448:mortar_board:None
2022-07-21A Wavelet Transform and self-supervised learning-based framework for bearing fault diagnosis with limited labeled dataYuhong Jin et.al.2207.10432:mortar_board:None
2022-07-21SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic NetworksChien-Yu Lin et.al.2207.10237:mortar_board:Code
2022-07-20MeshMAE: Masked Autoencoders for 3D Mesh Data AnalysisYaqian Liang et.al.2207.10228:mortar_board:None
2022-07-20Locality Guidance for Improving Vision Transformers on Tiny DatasetsKehan Li et.al.2207.10026:mortar_board:Code
2022-07-20ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention networkNikolaos Gkalelis et.al.2207.09927:mortar_board:None
2022-07-20Unsupervised Industrial Anomaly Detection via Pattern Generative and Contrastive NetworksJianfeng Huang et.al.2207.09792:mortar_board:None
2022-07-20AU-Supervised Convolutional Vision Transformers for Synthetic Facial Expression RecognitionShuyi Mao et.al.2207.09777:mortar_board:Code
2022-07-20On the Versatile Uses of Partial Distance Correlation in Deep LearningXingjian Zhen et.al.2207.09684:mortar_board:Code
2022-07-19Towards Trustworthy Healthcare AI: Attention-Based Feature Learning for COVID-19 Screening With Chest RadiographyKai Ma et.al.2207.09312:mortar_board:None
2022-07-18Is Integer Arithmetic Enough for Deep Learning Training?Alireza Ghaffari et.al.2207.08822:mortar_board:None
2022-07-18Adversarial Pixel Restoration as a Pretext Task for Transferable PerturbationsHashmat Shadab Malik et.al.2207.08803:mortar_board:Code
2022-07-18Multi-manifold Attention for Vision TransformersDimitrios Konstantinidis et.al.2207.08569:mortar_board:None
2022-07-18TokenMix: Rethinking Image Mixing for Data Augmentation in Vision TransformersJihao Liu et.al.2207.08409:mortar_board:Code
2022-07-17Security Evaluation of Compressible Image Encryption for Privacy-Preserving Image Classification against Ciphertext-only AttacksTatsuya Chuman et.al.2207.08109:mortar_board:None
2022-07-16SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly DetectionAntonio Barbalau et.al.2207.08003:mortar_board:None
2022-07-16Explainable vision transformer enabled convolutional neural network for plant disease identification: PlantXViTPoornima Singh Thakur et.al.2207.07919:mortar_board:None
2022-07-15Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLPZhicai Wang et.al.2207.07284:mortar_board:Code
2022-07-15Lightweight Vision Transformer with Cross Feature AttentionYoupeng Zhao et.al.2207.07268:mortar_board:None
2022-07-14Convolutional Bypasses Are Better Vision Transformer AdaptersShibo Jie et.al.2207.07039:mortar_board:Code
2022-07-14iColoriT: Towards Propagating Local Hint to the Right Region in Interactive Colorization by Leveraging Vision TransformerSanghyeon Lee et.al.2207.06831:mortar_board:None
2022-07-14Deepfake Video Detection with Spatiotemporal Dropout TransformerDaichi Zhang et.al.2207.06612:mortar_board:None
2022-07-13Trans4Map: Revisiting Holistic Top-down Mapping from Egocentric Images to Allocentric Semantics with Vision TransformersChang Chen et.al.2207.06205:mortar_board:Code
2022-07-12Vision Transformer for NeRF-Based View Synthesis from a Single Input ImageKai-En Lin et.al.2207.05736:mortar_board:None
2022-07-12MSP-Former: Multi-Scale Projection Transformer for Single Image DesnowingSixiang Chen et.al.2207.05621:mortar_board:None
2022-07-12LightViT: Towards Light-Weight Convolution-Free Vision TransformersTao Huang et.al.2207.05557:mortar_board:Code
2022-07-12Long-term Leap Attention, Short-term Periodic Shift for Video ClassificationHao Zhang et.al.2207.05526:mortar_board:None
2022-07-12Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial ScenariosJiashi Li et.al.2207.05501:mortar_board:None
2022-07-12Image and Model Transformation with Secret Key for Vision TransformerHitoshi Kiya et.al.2207.05366:mortar_board:None
2022-07-12eX-ViT: A Novel eXplainable Vision Transformer for Weakly Supervised Semantic SegmentationLu Yu et.al.2207.05358:mortar_board:None
2022-07-12Outpainting by QueriesKai Yao et.al.2207.05312:mortar_board:Code
2022-07-12Trusted Multi-Scale Classification Framework for Whole Slide ImageMing Feng et.al.2207.05290:mortar_board:None
2022-07-11Wave-ViT: Unifying Wavelet and Transformers for Visual Representation LearningTing Yao et.al.2207.04978:mortar_board:Code
2022-07-11Dual Vision TransformerTing Yao et.al.2207.04976:mortar_board:Code
2022-07-11TNT: Vision Transformer for Turbulence SimulationsYuchen Dang et.al.2207.04616:mortar_board:None
2022-07-10Depthformer : Multiscale Vision Transformer For Monocular Depth Estimation With Local Global Information FusionAshutosh Agarwal et.al.2207.04535:mortar_board:Code
2022-07-10Facilitated machine learning for image-based fruit quality assessment in developing countriesManuel Knott et.al.2207.04523:mortar_board:None
2022-07-08Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing DomainTong Zhang et.al.2207.03860:mortar_board:None
2022-07-08VidConv: A modernized 2D ConvNet for Efficient Video RecognitionChuong H. Nguyen et.al.2207.03782:mortar_board:None
2022-07-07More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using SparsityShiwei Liu et.al.2207.03620:mortar_board:Code
2022-07-05Softmax-free Linear TransformersJiachen Lu et.al.2207.03341:mortar_board:Code
2022-07-07Vision Transformers: State of the Art and Research ChallengesBo-Kai Ruan et.al.2207.03041:mortar_board:None
2022-07-05Generalization to translation shifts: a study in architectures and augmentationsSuriya Gunasekar et.al.2207.02349:mortar_board:None
2022-07-05TractoFormer: A Novel Fiber-level Whole Brain Tractography Analysis Framework Using Spectral Embedding and Vision TransformersFan Zhang et.al.2207.02327:mortar_board:None
2022-07-05Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level AttentionGary Leung et.al.2207.02126:mortar_board:None
2022-07-05Transformer based Models for Unsupervised Anomaly Segmentation in Brain MR ImagesAhmed Ghorbel et.al.2207.02059:mortar_board:Code
2022-07-05CNN-based Local Vision Transformer for COVID-19 DiagnosisHongyan Xu et.al.2207.02027:mortar_board:None
2022-07-04Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural NetworksYongming Rao et.al.2207.01580:mortar_board:Code
2022-07-04I-ViT: Integer-only Quantization for Efficient Vision Transformer InferenceZhikai Li et.al.2207.01405:mortar_board:None
2022-07-03You Only Need One Detector: Unified Object Detector for Different Modalities based on Vision TransformersXiaoke Shen et.al.2207.01071:mortar_board:None
2022-07-01Polarized Color Image Denoising using PocoformerZhuoxiao Li et.al.2207.00215:mortar_board:None
2022-07-01Rethinking Query-Key Pairwise Interactions in Vision TransformersCheng Li et.al.2207.00188:mortar_board:None
2022-06-30PVT-COV19D: Pyramid Vision Transformer for COVID-19 DiagnosisLilang Zheng et.al.2206.15069:mortar_board:None
2022-06-29LViT: Language meets Vision Transformer in Medical Image SegmentationZihan Li et.al.2206.14718:mortar_board:Code
2022-06-29The Lighter The Better: Rethinking Transformers in Medical Image Segmentation Through Adaptive PruningXian Lin et.al.2206.14413:mortar_board:Code
2022-06-28Masked World Models for Visual ControlYounggyo Seo et.al.2206.14244:mortar_board:None
2022-06-28Robustifying Vision Transformer without Retraining from Scratch by Test-Time Class-Conditional Feature AlignmentTakeshi Kojima et.al.2206.13951:mortar_board:Code
2022-06-28Cross-Forgery Analysis of Vision Transformers and CNNs for Deepfake Image DetectionDavide Alessandro Coccomini et.al.2206.13829:mortar_board:None
2022-06-23QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixerJinmiao Huang et.al.2206.13231:mortar_board:None
2022-06-27Video2StyleGAN: Encoding Video in Latent Space for ManipulationJiyang Yu et.al.2206.13078:mortar_board:None
2022-06-26Vision Transformer for Contrastive ClusteringHua-Bao Ling et.al.2206.12925:mortar_board:None
2022-06-24Defending Backdoor Attacks on Vision Transformer via Patch ProcessingKhoa D. Doan et.al.2206.12381:mortar_board:None
2022-06-22Parallel Pre-trained Transformers (PPT) for Synthetic Data-based Instance SegmentationMing Li et.al.2206.10845:mortar_board:None
2022-06-21Scaling up Kernels in 3D CNNsYukang Chen et.al.2206.10555:mortar_board:Code
2022-06-21Vicinity Vision TransformerWeixuan Sun et.al.2206.10552:mortar_board:Code
2022-06-21Faster Diffusion Cardiac MRI with Deep Learning-based breath hold reductionMichael Tanzer et.al.2206.10543:mortar_board:None
2022-06-21Transformers Improve Breast Cancer Diagnosis from Unregistered Multi-View MammogramsXuxin Chen et.al.2206.10096:mortar_board:None
2022-06-20Global Context Vision TransformersAli Hatamizadeh et.al.2206.09959:mortar_board:Code
2022-06-19EATFormer: Improving Vision Transformer Inspired by Evolutionary AlgorithmJiangning Zhang et.al.2206.09325:mortar_board:Code
2022-06-18Replacing Labeled Real-image Datasets with Auto-generated ContoursHirokatsu Kataoka et.al.2206.09132:mortar_board:None
2022-06-17SimA: Simple Softmax-free Attention for Vision TransformersSoroush Abbasi Koohpayegani et.al.2206.08898:mortar_board:Code
2022-06-17Multi-Contextual Predictions with Vision Transformer for Video Anomaly DetectionJoo-Yeon Lee et.al.2206.08568:mortar_board:None
2022-06-17Rectify ViT Shortcut Learning by Visual SaliencyChong Ma et.al.2206.08567:mortar_board:None
2022-06-16Backdoor Attacks on Vision TransformersAkshayvarun Subramanya et.al.2206.08477:mortar_board:Code
2022-06-16IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor ScenesRui Zhu et.al.2206.08423:mortar_board:None
2022-06-16OmniMAE: Single Model Masked Pretraining on Images and VideosRohit Girdhar et.al.2206.08356:mortar_board:None
2022-06-16Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking ConsistencyViraj Prabhu et.al.2206.08222:mortar_board:Code
2022-06-16Patch-level Representation Learning for Self-supervised Vision TransformersSukmin Yun et.al.2206.07990:mortar_board:Code
2022-06-15What makes domain generalization hard?Spandan Madan et.al.2206.07802:mortar_board:None
2022-06-15Masked Siamese ConvNetsLi Jing et.al.2206.07700:mortar_board:None
2022-06-15A Simple Data Mixing Prior for Improving Self-Supervised LearningSucheng Ren et.al.2206.07692:mortar_board:Code
2022-06-15SP-ViT: Learning 2D Spatial Priors for Vision TransformersYuxuan Zhou et.al.2206.07662:mortar_board:None
2022-06-15Rethinking Generalization in Few-Shot ClassificationMarkus Hiller et.al.2206.07267:mortar_board:Code
2022-06-14Stand-Alone Inter-Frame Attention in Video ModelsFuchen Long et.al.2206.06931:mortar_board:Code
2022-06-14Efficient Decoder-free Object Detection with TransformersPeixian Chen et.al.2206.06829:mortar_board:Code
2022-06-14Peripheral Vision TransformerJuhong Min et.al.2206.06801:mortar_board:None
2022-06-14Exploring Adversarial Attacks and Defenses in Vision Transformers trained with DINOJavier Rando et.al.2206.06761:mortar_board:Code
2022-06-14TransVG++: End-to-End Visual Grounding with Language Conditioned Vision TransformerJiajun Deng et.al.2206.06619:mortar_board:Code
2022-06-13Multimodal Learning with Transformers: A SurveyPeng Xu et.al.2206.06488:mortar_board:None

3D Representations

Publish DateTitleAuthorsarxivPDFCode
2023-10-23Ghost on the Shell: An Expressive Representation of General 3D ShapesZhen Liu et.al.2310.15168:mortar_board:None
2023-10-22Learning Generalizable Manipulation Policies with Object-Centric 3D RepresentationsYifeng Zhu et.al.2310.14386:mortar_board:None
2023-10-18Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic PromptsXinhua Cheng et.al.2310.11784:mortar_board:None
2023-10-14JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal CuesJiayi Ji et.al.2310.09503:mortar_board:Code
2023-10-12PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training ParadigmHaoyi Zhu et.al.2310.08586:mortar_board:Code
2023-10-11Orbital Polarimetric Tomography of a Flare Near the Sagittarius A* Supermassive Black HoleAviad Levis et.al.2310.07687:mortar_board:None
2023-10-10Uni3D: Exploring Unified 3D Representation at ScaleJunsheng Zhou et.al.2310.06773:mortar_board:Code
2023-09-29TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text FieldsTianyu Huang et.al.2309.17175:mortar_board:None
2023-09-29HAvatar: High-fidelity Head Avatar via Facial Model Conditioned Neural Radiance FieldXiaochen Zhao et.al.2309.17128:mortar_board:None
2023-09-28ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and PlanningQiao Gu et.al.2309.16650:mortar_board:None
2023-09-26ITEM3D: Illumination-Aware Directional Texture Editing for 3D ModelsShengqi Liu et.al.2309.14872:mortar_board:None
2023-09-24MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance FieldZijiang Yang et.al.2309.13607:mortar_board:None
2023-09-19SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous DrivingXiangchao Yan et.al.2309.10527:mortar_board:Code
2023-09-14Large-Vocabulary 3D Diffusion Model with TransformerZiang Cao et.al.2309.07920:mortar_board:None
2023-09-14CoRF : Colorizing Radiance Fields using Knowledge DistillationAnkit Dhiman et.al.2309.07668:mortar_board:None
2023-09-12Learning Disentangled Avatars with Hybrid 3D RepresentationsYao Feng et.al.2309.06441:mortar_board:None
2023-09-11Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction ClipsYufei Ye et.al.2309.05663:mortar_board:None
2023-09-11PAg-NeRF: Towards fast and efficient end-to-end panoptic 3D representations for agricultural roboticsClaus Smitt et.al.2309.05339:mortar_board:None
2023-09-103D Implicit Transporter for Temporally Consistent Keypoint DiscoveryChengliang Zhong et.al.2309.05098:mortar_board:Code
2023-09-09Graph Vertex ModelTanmoy Sarkar et.al.2309.04818:mortar_board:Code
2023-09-04Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks and Zero-Curl RegularizationXianghui Yang et.al.2309.01512:mortar_board:None
2023-08-14Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded RegionsMiao Fan et.al.2308.16160:mortar_board:None
2023-08-28HoloFusion: Towards Photo-realistic 3D Generative ModelingAnimesh Karnewar et.al.2308.14244:mortar_board:None
2023-08-27Sparse3D: Distilling Multiview-Consistent Diffusion for Object Reconstruction from Sparse ViewsZi-Xin Zou et.al.2308.14078:mortar_board:None
2023-08-21UniM$^2$AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous DrivingJian Zou et.al.2308.10421:mortar_board:Code
2023-08-20Strata-NeRF : Neural Radiance Fields for Stratified ScenesAnkit Dhiman et.al.2308.10337:mortar_board:None
2023-08-18Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt TrainingXiaoyang Wu et.al.2308.09718:mortar_board:Code
2023-08-18Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud RecognitionXuanyu Yi et.al.2308.09694:mortar_board:None
2023-08-18MonoNeRD: NeRF-like Representations for Monocular 3D Object DetectionJunkai Xu et.al.2308.09421:mortar_board:Code
2023-08-17Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D ScenesZehan Wang et.al.2308.08769:mortar_board:None
2023-08-143D Analytics: Opportunities and Guidelines for Information Systems ResearchGunther Gust et.al.2308.08560:mortar_board:None
2023-08-16TeCH: Text-guided Reconstruction of Lifelike Clothed HumansYangyi Huang et.al.2308.08545:mortar_board:Code
2023-08-14Neural radiance fields in the industrial and robotics domain: applications, research opportunities and use casesEugen Šlapak et.al.2308.07118:mortar_board:Code
2023-08-10FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth ModelsGuangkai Xu et.al.2308.05733:mortar_board:None
2023-08-06 | Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation | Haowei Wang et.al. | 2308.02982 | :mortar_board: | Code
2023-08-05 | Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis | Yuxin Wang et.al. | 2308.02840 | :mortar_board: | None
2023-08-05 | NeRFs: The Search for the Best 3D Representation | Ravi Ramamoorthi et.al. | 2308.02751 | :mortar_board: | None
2023-07-28 | VPP: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation | Zekun Qi et.al. | 2307.16605 | :mortar_board: | Code
2023-07-31 | JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery | Jiahao Li et.al. | 2307.16377 | :mortar_board: | Code
2023-07-27 | Learning Full-Head 3D GANs from a Single-View Portrait Dataset | Yiqian Wu et.al. | 2307.14770 | :mortar_board: | None
2023-07-20 | PAPR: Proximity Attention Point Rendering | Yanshu Zhang et.al. | 2307.11086 | :mortar_board: | None
2023-07-18 | Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells | Xinyi Ye et.al. | 2307.09160 | :mortar_board: | Code
2023-07-18 | NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF | Stefan Lionar et.al. | 2307.09112 | :mortar_board: | None
2023-07-12 | Semantic Communications System with Model Division Multiple Access and Controllable Coding Rate for Point Cloud | Xiaoyi Liu et.al. | 2307.06027 | :mortar_board: | None
2023-07-11 | Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives | Tom Monnier et.al. | 2307.05473 | :mortar_board: | None
2023-06-28 | Points for Energy Renovation (PointER): A LiDAR-Derived Point Cloud Dataset of One Million English Buildings Linked to Energy Characteristics | Sebastian Krapf et.al. | 2306.16020 | :mortar_board: | Code
2023-06-27 | Meshes Meet Voxels: Abdominal Organ Segmentation via Diffeomorphic Deformations | Fabian Bongratz et.al. | 2306.15515 | :mortar_board: | None
2023-06-26 | RVT: Robotic View Transformer for 3D Object Manipulation | Ankit Goyal et.al. | 2306.14896 | :mortar_board: | Code
2023-06-19 | UniG3D: A Unified 3D Object Generation Dataset | Qinghong Sun et.al. | 2306.10730 | :mortar_board: | None
2023-06-15 | CAD-Estate: Large-scale CAD Model Annotation in RGB Videos | Kevis-Kokitsi Maninis et.al. | 2306.09011 | :mortar_board: | None
2023-06-11 | On the Efficacy of 3D Point Cloud Reinforcement Learning | Zhan Ling et.al. | 2306.06799 | :mortar_board: | Code
2023-06-09 | GANeRF: Leveraging Discriminators to Optimize Neural Radiance Fields | Barbara Roessle et.al. | 2306.06044 | :mortar_board: | None
2023-06-08 | Tracking Objects with 3D Representation from Videos | Jiawei He et.al. | 2306.05416 | :mortar_board: | None
2023-06-05 | ZIGNeRF: Zero-shot 3D Scene Representation with Invertible Generative Neural Radiance Fields | Kanghyeok Ko et.al. | 2306.02741 | :mortar_board: | None
2023-06-05 | Learning from Multi-View Representation for Point-Cloud Pre-Training | Siming Yan et.al. | 2306.02558 | :mortar_board: | None
2023-06-03 | Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution | Yiji Cheng et.al. | 2306.02083 | :mortar_board: | None
2023-05-26 | BEV-IO: Enhancing Bird’s-Eye-View 3D Detection with Instance Occupancy | Zaibin Zhang et.al. | 2305.16829 | :mortar_board: | None
2023-05-19 | Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields | Jingbo Zhang et.al. | 2305.11588 | :mortar_board: | None
2023-05-18 | OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding | Minghua Liu et.al. | 2305.10764 | :mortar_board: | None
2023-05-15 | Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | Zhimin Chen et.al. | 2305.08776 | :mortar_board: | None
2023-05-14 | ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | Le Xue et.al. | 2305.08275 | :mortar_board: | Code
2023-05-09 | DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects | Chen Bao et.al. | 2305.05706 | :mortar_board: | None
2023-05-03 | Real-Time Radiance Fields for Single-Image Portrait View Synthesis | Alex Trevithick et.al. | 2305.02310 | :mortar_board: | None
2023-04-27 | Learning a Diffusion Prior for NeRFs | Guandao Yang et.al. | 2304.14473 | :mortar_board: | None
2023-04-26 | Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation | Eric Ming Chen et.al. | 2304.13681 | :mortar_board: | None
2023-04-25 | PoseVocab: Learning Joint-structured Pose Embeddings for Human Avatar Modeling | Zhe Li et.al. | 2304.13006 | :mortar_board: | Code
2023-04-25 | Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur | Peng Dai et.al. | 2304.12652 | :mortar_board: | None
2023-04-22 | 3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes | Haotian Xue et.al. | 2304.11470 | :mortar_board: | None
2023-04-22 | NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation | Baao Xie et.al. | 2304.11342 | :mortar_board: | None
2023-04-19 | Single-View View Synthesis with Self-Rectified Pseudo-Stereo | Yang Zhou et.al. | 2304.09527 | :mortar_board: | None
2023-04-16 | Likelihood-Based Generative Radiance Field with Latent Space Energy-Based Model for 3D-Aware Disentangled Image Representation | Yaxuan Zhu et.al. | 2304.07918 | :mortar_board: | None
2023-04-14 | UVA: Towards Unified Volumetric Avatar for View Synthesis, Pose rendering, Geometry and Texture Editing | Jinlong Fan et.al. | 2304.06969 | :mortar_board: | None
2023-04-13 | Learning Controllable 3D Diffusion Models from Single-view Images | Jiatao Gu et.al. | 2304.06700 | :mortar_board: | None
2023-04-13 | Survey on LiDAR Perception in Adverse Weather Conditions | Mariella Dreissig et.al. | 2304.06312 | :mortar_board: | None
2023-04-11 | TT-SDF2PC: Registration of Point Cloud and Compressed SDF Directly in the Memory-Efficient Tensor Train Domain | Alexey I. Boyko et.al. | 2304.05342 | :mortar_board: | None
2023-04-11 | MRVM-NeRF: Mask-Based Pretraining for Neural Radiance Fields | Ganlin Yang et.al. | 2304.04962 | :mortar_board: | None
2023-03-31 | Exploiting synchrotron X-ray tomography for a novel insight into flax-fibre defects ultrastructure | Delphine Quereilhac et.al. | 2303.18127 | :mortar_board: | None
2023-03-29 | TriVol: Point Cloud Rendering via Triple Volumes | Tao Hu et.al. | 2303.16485 | :mortar_board: | Code
2023-03-24 | Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning | Xiaoyang Wu et.al. | 2303.14191 | :mortar_board: | Code
2023-03-24 | BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects | Bowen Wen et.al. | 2303.14158 | :mortar_board: | None
2023-03-24 | NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images | Mingwu Zheng et.al. | 2303.14092 | :mortar_board: | Code
2023-03-24 | SPONGE: Sequence Planning with Deformable-ON-Rigid Contact Prediction from Geometric Features | Tran Nguyen Le et.al. | 2303.14012 | :mortar_board: | None
2023-03-24 | TEGLO: High Fidelity Canonical Texture Mapping from Single-View Images | Vishal Vinod et.al. | 2303.13743 | :mortar_board: | None
2023-03-23 | NEWTON: Neural View-Centric Mapping for On-the-Fly Large-Scale SLAM | Hidenobu Matsuki et.al. | 2303.13654 | :mortar_board: | None
2023-03-22 | NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions | Mohamad Shahbazi et.al. | 2303.12865 | :mortar_board: | Code
2023-03-22 | CLIP^2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data | Yihan Zeng et.al. | 2303.12417 | :mortar_board: | None
2023-03-21 | SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation | Juil Koo et.al. | 2303.12236 | :mortar_board: | None
2023-03-21 | Vox-E: Text-guided Voxel Editing of 3D Objects | Etai Sella et.al. | 2303.12048 | :mortar_board: | None
2023-03-20 | 3D Concept Learning and Reasoning from Multi-View Images | Yining Hong et.al. | 2303.11327 | :mortar_board: | None
2023-03-20 | Learning to Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery | Maxim Khomiakov et.al. | 2303.11215 | :mortar_board: | None
2023-03-16 | NeRFMeshing: Distilling Neural Radiance Fields into Geometrically-Accurate 3D Meshes | Marie-Julie Rakotosaona et.al. | 2303.09431 | :mortar_board: | None
2023-03-16 | Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation | Xingyu Chen et.al. | 2303.09036 | :mortar_board: | None
2023-03-14 | MeshDiffusion: Score-based Generative 3D Mesh Modeling | Zhen Liu et.al. | 2303.08133 | :mortar_board: | None
2023-03-12 | StereoTac: a Novel Visuotactile Sensor that Combines Tactile Sensing with 3D Vision | Etienne Roberge et.al. | 2303.06542 | :mortar_board: | None
2023-03-11 | FAC: 3D Representation Learning via Foreground Aware Feature Contrast | Kangcheng Liu et.al. | 2303.06388 | :mortar_board: | Code
2023-03-09 | 3D Video Loops from Asynchronous Input | Li Ma et.al. | 2303.05312 | :mortar_board: | None
2023-03-08 | Neural Vector Fields: Implicit Representation by Explicit Learning | Xianghui Yang et.al. | 2303.04341 | :mortar_board: | None
2023-03-03 | Multi-Plane Neural Radiance Fields for Novel View Synthesis | Youssef Abdelkareem et.al. | 2303.01736 | :mortar_board: | None
2023-02-28 | CLR-GAM: Contrastive Point Cloud Learning with Guided Augmentation and Feature Mapping | Srikanth Malla et.al. | 2302.14306 | :mortar_board: | None
2023-02-27 | Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training | Ziyu Guo et.al. | 2302.14007 | :mortar_board: | None
2023-02-26 | Makeup Extraction of 3D Representation via Illumination-Aware Image Decomposition | Xingchao Yang et.al. | 2302.13279 | :mortar_board: | None
2023-02-16 | Spectral 3D Computer Vision – A Review | Yajie Sun et.al. | 2302.08054 | :mortar_board: | None
2023-02-05 | Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining | Zekun Qi et.al. | 2302.02318 | :mortar_board: | Code
2023-01-27 | HyperNeRFGAN: Hypernetwork approach to 3D NeRF GAN | Adam Kania et.al. | 2301.11631 | :mortar_board: | Code
2023-01-20 | Semi-analytical computation of heteroclinic connections between center manifolds with the parameterization method | Miquel Barcelona et.al. | 2301.08526 | :mortar_board: | None
2023-01-18 | Joint Representation Learning for Text and 3D Point Cloud | Rui Huang et.al. | 2301.07584 | :mortar_board: | None
2023-01-18 | OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation | Tong Wu et.al. | 2301.07525 | :mortar_board: | None
2023-01-18 | Three-dimensional reconstruction and characterization of bladder deformations | Augustin C. Ogier et.al. | 2301.07385 | :mortar_board: | None
2023-01-12 | Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss | Anas Mahmoud et.al. | 2301.05709 | :mortar_board: | None
2023-01-10 | Neural Radiance Field Codebooks | Matthew Wallingford et.al. | 2301.04101 | :mortar_board: | None
2023-01-06 | Object as Query: Equipping Any 2D Object Detector with 3D Detection Ability | Zitian Wang et.al. | 2301.02364 | :mortar_board: | None
2022-12-18 | SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images | Abdullah Hamdi et.al. | 2212.09100 | :mortar_board: | Code
2022-12-16 | Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? | Runpei Dong et.al. | 2212.08320 | :mortar_board: | Code
2022-12-13 | Structured 3D Features for Reconstructing Relightable and Animatable Avatars | Enric Corona et.al. | 2212.06820 | :mortar_board: | None
2022-12-13 | Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders | Renrui Zhang et.al. | 2212.06785 | :mortar_board: | Code
2022-12-10 | ULIP: Learning Unified Representation of Language, Image and Point Cloud for 3D Understanding | Le Xue et.al. | 2212.05171 | :mortar_board: | Code
2022-12-09 | LoopDraw: a Loop-Based Autoregressive Model for Shape Synthesis and Editing | Nam Anh Dinh et.al. | 2212.04981 | :mortar_board: | Code
2022-12-09 | Neural Volume Super-Resolution | Yuval Bahat et.al. | 2212.04666 | :mortar_board: | None
2022-12-02 | 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation | Zutao Jiang et.al. | 2212.01103 | :mortar_board: | None
2022-12-01 | SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction | Zhizhuo Zhou et.al. | 2212.00792 | :mortar_board: | None
2022-11-24 | DiffusionSDF: Conditional Generative Modeling of Signed Distance Functions | Gene Chou et.al. | 2211.13757 | :mortar_board: | None
2022-11-23 | Tetrahedral Diffusion Models for 3D Shape Generation | Nikolai Kalischek et.al. | 2211.13220 | :mortar_board: | None
2022-11-21 | Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields | Yue Chen et.al. | 2211.11505 | :mortar_board: | None
2022-11-21 | Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars | Jingxiang Sun et.al. | 2211.11208 | :mortar_board: | Code
2022-11-20 | IC3D: Image-Conditioned 3D Diffusion for Shape Generation | Cristian Sbrolli et.al. | 2211.10865 | :mortar_board: | None
2022-11-17 | RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation | Titas Anciukevičius et.al. | 2211.09869 | :mortar_board: | Code
2022-11-15 | ParticleGrid: Enabling Deep Learning using 3D Representation of Materials | Shehtab Zaman et.al. | 2211.08506 | :mortar_board: | Code
2022-11-11 | Shock-accelerated electrons during the fast expansion of a coronal mass ejection | D. E. Morosan et.al. | 2211.06049 | :mortar_board: | None
2022-11-09 | ChromoSkein: Untangling Three-Dimensional Chromatin Fiber With a Web-Based Visualization Framework | Matúš Talčík et.al. | 2211.05125 | :mortar_board: | None
2022-11-03 | Semantic 3D Grid Maps for Autonomous Driving | Ajinkya Khoche et.al. | 2211.01700 | :mortar_board: | Code
2022-10-31 | gCoRF: Generative Compositional Radiance Fields | Mallikarjun BR et.al. | 2210.17344 | :mortar_board: | None
2022-10-27 | Deep Generative Models on 3D Representations: A Survey | Zifan Shi et.al. | 2210.15663 | :mortar_board: | None
2022-10-26 | Analyzing Deep Learning Representations of Point Clouds for Real-Time In-Vehicle LiDAR Perception | Marc Uecker et.al. | 2210.14612 | :mortar_board: | None
2022-10-25 | MICP-L: Fast parallel simulative Range Sensor to Mesh registration for Robot Localization | Alexander Mock et.al. | 2210.13904 | :mortar_board: | Code
2022-10-24 | Learning Neural Radiance Fields from Multi-View Geometry | Marco Orsingher et.al. | 2210.13041 | :mortar_board: | None
2022-10-22 | NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos | Yi-Ling Qiao et.al. | 2210.12352 | :mortar_board: | None
2022-10-20 | Coordinates Are NOT Lonely – Codebook Prior Helps Implicit Neural 3D Representations | Fukun Yin et.al. | 2210.11170 | :mortar_board: | Code
2022-10-14 | Reference Based Color Transfer for Medical Volume Rendering | Sudarshan Devkota et.al. | 2210.08083 | :mortar_board: | None
2022-10-13 | Visual Reinforcement Learning with Self-Supervised 3D Representations | Yanjie Ze et.al. | 2210.07241 | :mortar_board: | None
2022-10-12 | AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars | Yue Wu et.al. | 2210.06465 | :mortar_board: | None
2022-10-06 | XDGAN: Multi-Modal 3D Shape Generation in 2D Space | Hassan Abu Alhaija et.al. | 2210.03007 | :mortar_board: | None
2022-10-05 | Water Simulation and Rendering from a Still Photograph | Ryusuke Sugimoto et.al. | 2210.02553 | :mortar_board: | None
2022-10-04 | Bridged Transformer for Vision and Point Cloud 3D Object Detection | Yikai Wang et.al. | 2210.01391 | :mortar_board: | None
2022-08-14 | Widely Used and Fast De Novo Drug Design by a Protein Sequence-Based Reinforcement Learning Model | Yaqin Li et.al. | 2209.07405 | :mortar_board: | None
2022-09-07 | Multi-NeuS: 3D Head Portraits from Single Image with Neural Implicit Functions | Egor Burkov et.al. | 2209.04436 | :mortar_board: | None
2022-09-09 | Towards Confidence-guided Shape Completion for Robotic Applications | Andrea Rosasco et.al. | 2209.04300 | :mortar_board: | Code
2022-08-30 | Inferring Implicit 3D Representations from Human Figures on Pictorial Maps | Raimund Schnürer et.al. | 2209.02385 | :mortar_board: | None
2022-08-24 | PeRFception: Perception using Radiance Fields | Yoonwoo Jeong et.al. | 2208.11537 | :mortar_board: | Code
2022-08-23 | Spiral Contrastive Learning: An Efficient 3D Representation Learning Method for Unannotated CT Lesions | Penghua Zhai et.al. | 2208.10694 | :mortar_board: | None
2022-08-19 | Temporal View Synthesis of Dynamic Scenes through 3D Object Motion Estimation with Multi-Plane Images | Nagabhushan Somraj et.al. | 2208.09463 | :mortar_board: | None
2022-08-08 | 3D Vision with Transformers: A Survey | Jean Lahoud et.al. | 2208.04309 | :mortar_board: | Code
2022-08-03 | Vision-Based Safety System for Barrierless Human-Robot Collaboration | Lina María Amaya-Mejía et.al. | 2208.02010 | :mortar_board: | None
2022-08-02 | Self-Supervised Traversability Prediction by Learning to Reconstruct Safe Terrain | Robin Schmid et.al. | 2208.01329 | :mortar_board: | None
2022-07-29 | Neural Density-Distance Fields | Itsuki Ueda et.al. | 2207.14455 | :mortar_board: | Code
2022-07-26 | ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection | Junbo Yin et.al. | 2207.12654 | :mortar_board: | Code
2022-07-24 | Cross-Modal 3D Shape Generation and Manipulation | Zezhou Cheng et.al. | 2207.11795 | :mortar_board: | None
2022-07-22 | Seeing 3D Objects in a Single Image via Self-Supervised Static-Dynamic Disentanglement | Prafull Sharma et.al. | 2207.11232 | :mortar_board: | None
2022-07-21 | Approximate Differentiable Rendering with Algebraic Surfaces | Leonid Keselman et.al. | 2207.10606 | :mortar_board: | None
2022-07-18 | Latent Partition Implicit with Surface Codes for 3D Representation | Chao Chen et.al. | 2207.08631 | :mortar_board: | Code
2022-07-16 | Consistency of Implicit and Explicit Features Matters for Monocular 3D Object Detection | Qian Ye et.al. | 2207.07933 | :mortar_board: | None
2022-07-13 | 3D Concept Grounding on Neural Fields | Yining Hong et.al. | 2207.06403 | :mortar_board: | None
2022-07-12 | Vision Transformer for NeRF-Based View Synthesis from a Single Input Image | Kai-En Lin et.al. | 2207.05736 | :mortar_board: | None
2022-07-06 | VMRF: View Matching Neural Radiance Fields | Jiahui Zhang et.al. | 2207.02621 | :mortar_board: | None
2022-06-23 | EventNeRF: Neural Radiance Fields from a Single Colour Event Camera | Viktor Rudnev et.al. | 2206.11896 | :mortar_board: | None
2022-06-22 | KiloNeuS: Implicit Neural Representations with Real-Time Global Illumination | Stefano Esposito et.al. | 2206.10885 | :mortar_board: | None
2022-06-20 | WiFi-based Spatiotemporal Human Action Perception | Yanling Hao et.al. | 2206.09867 | :mortar_board: | None
2022-06-13 | AR-NeRF: Unsupervised Learning of Depth and Defocus Effects from Natural Images with Aperture Rendering Neural Radiance Fields | Takuhiro Kaneko et.al. | 2206.06100 | :mortar_board: | None
2022-06-12 | NeuralODF: Learning Omnidirectional Distance Fields for 3D Shape Representation | Trevor Houchens et.al. | 2206.05837 | :mortar_board: | None
2022-06-09 | Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields | Mingtong Zhang et.al. | 2206.04669 | :mortar_board: | None
2022-06-08 | Learning Ego 3D Representation as Ray Tracing | Jiachen Lu et.al. | 2206.04042 | :mortar_board: | Code
2022-06-08 | CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving | Runjian Chen et.al. | 2206.04028 | :mortar_board: | None
2022-06-02 | Machine Learning for Detection of 3D Features using sparse X-ray data | Bradley T. Wolfe et.al. | 2206.02564 | :mortar_board: | None
2022-06-05 | FOF: Learning Fourier Occupancy Field for Monocular Real-time Human Reconstruction | Qiao Feng et.al. | 2206.02194 | :mortar_board: | None
2022-05-30 | Neural Volumetric Object Selection | Zhongzheng Ren et.al. | 2205.14929 | :mortar_board: | None
2022-05-28 | Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training | Renrui Zhang et.al. | 2205.14401 | :mortar_board: | Code
2022-05-25 | sat2pc: Estimating Point Cloud of Building Roofs from 2D Satellite Images | Yoones Rezaei et.al. | 2205.12464 | :mortar_board: | None

My Arxiv Daily
http://baiyucraft.top/Arxiv/Arxiv-daily.html
Author
baiyucraft
Published on
October 25, 2023
License