Academic Log | October-December 2022
A collection of academic papers/blogs/talks/projects that I read/watched/explored during these months. I also include any small (or large) personal projects I did and other related ML/non-ML work.
Personal Projects
- Paper re-implementation - Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability by Cohen et al., 2021 - [Github]
- Paper re-implementation - The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks by Frankle et al., 2018 - [Github]
- Paper re-implementation - An Empirical Model of Large-Batch Training by OpenAI, 2018 - [Github]
Annotated Papers
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
- Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
- Modeling Language Usage and Listener Engagement in Podcasts
- Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model
- An Empirical Model of Large-Batch Training
- Fine-Tuning Language Models from Human Preferences
- Training language models to follow instructions with human feedback
- Adam: A Method for Stochastic Optimization
- Monolith: Real Time Recommendation System With Collisionless Embedding Table
- Limitations of the NTK for Understanding Generalization in Deep Learning
- What can linearized neural networks actually say about generalization?
- GLM-130B: An Open Bilingual Pre-trained Model
- Augmenting Netflix Search with In-Session Adapted Recommendations
- Adversary or Friend? An Adversarial Approach to Improving Recommender Systems
Papers I read (in addition to the above)
- How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources
- Artificial Interrogation for Attributing Language Models
- Deep Learning on a Data Diet: Finding Important Examples Early in Training
- BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning
- Nonuniform Negative Sampling and Log Odds Correction with Rare Events Data
- Confidence-Ranked Reconstruction of Census Microdata from Published Statistics
- How Optimal is Greedy Decoding for Extractive Question Answering?
- The Curious Case of Absolute Position Embeddings
- Finding the smallest or largest element of a tensor from its low-rank factors
- GreaseLM: Graph REASoning Enhanced Language Models for Question Answering
- QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
- Red-Teaming the Stable Diffusion Safety Filter
- Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale
Blogs I read & videos I watched
- Notifications: why less is more — how Facebook has been increasing both user satisfaction and app usage by sending only a few notifications (Analytics at Meta)
- Feature Engineering for Personalized Search (fennel.ai)
- Are brain implants the future of computing? (YouTube)
- A New Way to Achieve Nuclear Fusion: Helion
- The secret lives of MI6’s top female spies
- Re-examining LayerNorm + Code
- Mysteries of Mode Collapse due to RLHF
- Will we run out of ML data? Evidence from projecting dataset size trends
- Generating Human-level Text with Contrastive Search in Transformers
- How I learn machine learning (Vicki Boykis)
- Emerging Research & Applications of Large Language Models (w/ Google Brain, Replit, & HuggingFace) (YouTube)
- GPU Puzzles - https://github.com/srush/GPU-Puzzles
- How Pinterest Leverages Realtime User Actions in Recommendation to Boost Homefeed Engagement Volume (Pinterest Engineering Blog)
- Real-Time Research Recording: Can a Transformer Re-Derive Positional Info? (YouTube)
- Neural Tangent Kernel Distillation (LessWrong)
- Ethan Caballero – Broken Neural Scaling Laws (YouTube)
- Reducing Instagram’s basic video compute time by 94 percent (fb.com)
- NTK
- Gaussian Processes
Courses
- Revisited some of the Advanced NLP (Fall '22) lectures, having completed the Fall '21 set of lectures in early 2022.