Computer Science

Academic Log | October-December 2022

A collection of academic papers/blogs/talks/projects that I read/watched/explored during these months. I also include any small (or large) personal projects that I did and any related ML/non-ML work.

Personal Projects

- Paper re-implementation - "Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability" by Cohen et al., 2021 - [Github]
- Paper re-implementation - "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks" by Frankle et al., 2018 - [Github] (a short pruning sketch follows at the end of this section)
- Paper re-implementation - "An Empirical Model of Large-Batch Training" by OpenAI, 2018 - [Github]

Annotated Papers

- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
- Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
- Modeling Language Usage and Listener Engagement in Podcasts
- Which Algorithmic Choices Matter at Which Batch Sizes?
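
Since the Lottery Ticket paper appears both as a re-implementation and an annotated paper, here is the promised sketch of its core mechanism: one round of magnitude pruning with weight rewinding, in the spirit of Frankle et al. (2018). This is a minimal sketch against a generic PyTorch module; the function name, the `prune_frac` default, and the `init_state` convention are my own illustrative choices, not code from the linked repo.

```python
import copy
import torch

def magnitude_prune_and_rewind(model, init_state, prune_frac=0.2):
    """One lottery-ticket round: zero the smallest-magnitude weights in each
    weight matrix and rewind the survivors to their values at initialisation."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:  # leave biases and norm parameters unpruned
            continue
        flat = param.detach().abs().flatten()
        k = int(prune_frac * flat.numel())
        if k == 0:
            continue
        threshold = flat.kthvalue(k).values  # k-th smallest magnitude
        masks[name] = (param.detach().abs() > threshold).float()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                # rewind to the original initialisation, then apply the mask
                param.copy_(init_state[name] * masks[name])
    return masks

# Usage sketch: snapshot the initialisation, train, then prune and rewind.
# init_state = copy.deepcopy(model.state_dict())
# ... train model ...
# masks = magnitude_prune_and_rewind(model, init_state)
```

Note that during the subsequent training run the masks have to be re-applied (or the corresponding gradients zeroed) after every optimiser step, otherwise the pruned weights drift away from zero.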

Academic Log | August/September 2022

A collection of academic papers/blogs/talks/projects that I read/watched/explored during these months. I also include any small (or large) personal projects that I did and any related ML/non-ML work.

Personal Projects

- VAE-Implementation - A simple implementation of Autoencoder and Variational Autoencoder - [Github]
- MinHash-Implementation - A simple MinHash implementation based on the explanation in the Mining of Massive Datasets course by Stanford - [Github] (a short sketch follows below)
- Paper re-implementation - Sentence VAE paper, "Generating Sentences from a Continuous Space" by Bowman et al.
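
As a companion to the MinHash project above, here is a minimal sketch of the signature scheme as presented in the Mining of Massive Datasets course: random affine hash functions modulo a prime, with the fraction of agreeing signature positions estimating Jaccard similarity. All names and constants are illustrative, not taken from the linked repo.

```python
import random

PRIME = 4294967311  # a prime larger than 2**32

def make_hash_funcs(n, seed=0):
    """n random affine maps x -> (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]

def minhash_signature(items, hash_funcs):
    """One minimum per hash function over the (integer-hashed) items.
    Python salts str hashes per process, which is fine within one run."""
    hashed = [hash(x) & 0xFFFFFFFF for x in items]
    return [min((a * h + b) % PRIME for h in hashed) for a, b in hash_funcs]

def estimated_jaccard(sig1, sig2):
    """Fraction of agreeing signature positions ~ Jaccard similarity."""
    return sum(s1 == s2 for s1, s2 in zip(sig1, sig2)) / len(sig1)

funcs = make_hash_funcs(128)
a = minhash_signature({"the", "quick", "brown", "fox"}, funcs)
b = minhash_signature({"the", "quick", "brown", "dog"}, funcs)
print(estimated_jaccard(a, b))  # close to 3/5, the true Jaccard similarity
```

The estimate works because, under a random permutation of the element universe, the probability that two sets share the same minimum element is exactly their Jaccard similarity, and each hash function approximates one such permutation.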

Academic Log | June/July 2022

A collection of academic papers/blogs/talks/projects that I read/watched/explored during these months. I also include any small (or large) personal projects that I did and any related ML/non-ML work.

Personal Projects

- Paper re-implementation - "Extracting Training Data from Large Language Models" by Carlini et al., 2021 - [Github] (a short extraction sketch follows at the end of this section)

Annotated Papers

- Learning Backward Compatible Embeddings
- Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
- Tracing Knowledge in Language Models Back to the Training Data

Papers I read

- On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features
- PaLM: Scaling Language Modeling with Pathways
- Hierarchical Text-Conditional Image Generation with CLIP Latents
- Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
- Unified Contrastive Learning in Image-Text-Label Space
- Improving Passage Retrieval with Zero-Shot Question Generation
- Exploring Dual Encoder Architectures for Question Answering
- Efficient Fine-Tuning of BERT Models on the Edge
- Fine-Tuning Transformers: Vocabulary Transfer
- Manipulating SGD with Data Ordering Attacks
- Differentially Private Fine-tuning of Language Models
- Extracting Training Data from Large Language Models
- Learning Backward Compatible Embeddings
- Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
- Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift
- Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
- Tracing Knowledge in Language Models Back to the Training Data

Blogs I read

- Domain Adaptation with Generative Pseudo-Labeling (GPL)
- Making Deep Learning Go Brrrr From First Principles
- Introduction to TorchScript
- Nonlinear Computation in Deep Linear Networks

Talks I watched

- How GPU Computing Works
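
To make the re-implementation above concrete, here is a rough sketch of the first extraction step from "Extracting Training Data from Large Language Models": sample unconditioned generations from GPT-2 and rank them by perplexity, so the most confidently generated (and potentially memorised) texts surface first. It uses the Hugging Face transformers API; the sampling parameters and helper names are illustrative, and the paper additionally filters candidates with other metrics (e.g. a zlib compression ratio) that this sketch omits.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text):
    """exp of the mean token negative log-likelihood under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

def sample_candidates(n=8, max_length=64):
    """Unconditioned top-k samples, starting from the BOS token."""
    out = model.generate(
        input_ids=torch.tensor([[tokenizer.bos_token_id]]),
        do_sample=True, top_k=40, max_length=max_length,
        num_return_sequences=n, pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(seq, skip_special_tokens=True) for seq in out]

# Lowest perplexity first: the most "confidently" generated candidates.
ranked = sorted(sample_candidates(), key=perplexity)
for text in ranked[:3]:
    print(round(perplexity(text), 1), text[:80])
```

In the paper this ranking is only a first pass; flagged candidates are then checked against the training corpus to confirm actual memorisation.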