AndyBlocker
Recent Posts
QKFormer: Hierarchical Spiking Transformer using Q-K Attention
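Published: at 18:09
QKFormer, a NeurIPS 2024 Spotlight paper, pushes the accuracy of directly trained SNNs on ImageNet and CIFAR very high. It feels like any future work in this area will have to reckon with it.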
Transformers without Normalization
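Published: at 16:09
Kaiming He's new work: it replaces Norm with DyT, turning an operation that requires synchronization into an element-wise one. It is used in a new paper, so I am studying it.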
Visualizing and Understanding the Effectiveness of BERT
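Published: at 10:21
While training SNNs recently, I have been looking into how to visualize the loss during training, and wondering whether a newly added method affects the model's loss landscape. Articles on how to visualize a loss landscape generally cite this paper's analysis and approach.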
One-Minute Video Generation with Test-Time Training
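Published: at 18:17
The TTT video generation work whose demo has been very popular lately; it can generate long videos on the order of 60 seconds. I am learning about TTT: could SNN on-chip learning be combined with it?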
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
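Published: at 16:40
For the past couple of days I have been working on SNN training and need to verify the accuracy of the surrogate gradient in use. My advisor suggested reading this paper and using Evolution Strategies to check the accuracy of the current gradient estimate.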
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute
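Published: at 11:06
SparTA, a DNN compiler with sparsity optimizations: it treats tensor sparsity as an important attribute throughout compilation in order to generate efficient code.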
Scalable Diffusion Models with Transformers
Published: at 16:29
Diffusion Transformer.
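A First Look at AI Infra
Updated: at 18:30
Published: at 16:04
Taking the opportunity of my recent internship search to study and summarize what I had previously picked up piecemeal about accelerating model inference and training, along with some material on CUDA programming and its architecture.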
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Updated: at 14:57
Published: at 14:39
Replaces self-attention with large-kernel DS convolution. Work from ByteDance Singapore.
SpikeCV: Open a Continuous Computer Vision Era
Updated: at 14:57
Published: at 15:33
An open-source framework for event cameras.