Shreyansh Singh
  • Faster Cross-Encoder Inference: Unleashing torch.compile for speed

    A quick writeup on accelerating a Jina Cross-Encoder using torch.compile

    20 min read   ·   March 02, 2025

    2025   ·   inference-optimization   efficiency   mlsys   ·   MLSys

  • Paper Summary #13 - Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

    My notes from the Physics of Language Models series of papers.

    27 min read   ·   September 21, 2024

    2024   ·   transformer   reasoning   paper-summaries   ·   LLMs

  • Paper Summary #12 - Image Recaptioning in DALL-E 3

    The image recaptioning technique used in DALL-E 3 was extended to videos in Sora.

    12 min read   ·   February 18, 2024

    2024   ·   image-captioning   generative-ai   ·   Computer Vision

  • Paper Summary #11 - Sora

    OpenAI announced a ground-breaking text-to-video diffusion model capable of generating high-definition videos up to 60 seconds long.

    7 min read   ·   February 18, 2024

    2024   ·   diffusion   image-generation   video-generation   generative-ai   ·   Computer Vision

  • Paper Summary #10 - Gemini 1.5 Pro

    Google DeepMind announced a multimodal LLM that supports a context length of up to 10M tokens.

    19 min read   ·   February 18, 2024

    2024   ·   llm   multimodal   transformer   ·   LLMs

  • Solving Substitution Ciphers using Markov Chain Monte Carlo (MCMC)

    Deciphering substitution ciphers can be framed as a Markov chain problem, and a simple Monte Carlo sampling approach can solve them very efficiently.

    5 min read   ·   July 23, 2023

    2023   ·   sampling   probability   mcmc   cryptography   ·   Mathematics

  • Paper Summary #9 - Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

    Understanding Sophia, a new fast, scalable second-order optimizer that beats Adam on LLM pretraining.

    31 min read   ·   May 28, 2023

    2023   ·   transformer   optimizer   deep-learning   paper-summaries   ·   Deep Learning   ML Theory

  • Paper Summary #8 - FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

    Understanding FlashAttention, a highly efficient exact attention implementation that optimizes for both memory requirements and wall-clock time.

    37 min read   ·   March 26, 2023

    2023   ·   mlsys   transformer   efficiency   attention   paper-summaries   ·   MLSys

  • Paper Summary #7 - Efficient Transformers: A Survey

    A survey of improvements over the original Transformer architecture in terms of memory efficiency.

    24 min read   ·   October 10, 2022

    2022   ·   mlsys   transformer   efficiency   attention   paper-summaries   ·   MLSys   LLMs

  • Deploying Machine Learning models using GCP's Google AI Platform - A Detailed Tutorial

    A step-by-step tutorial on deploying an ML model on GCP, specifically the Google AI Platform, and using Streamlit to access the model through a UI.

    21 min read   ·   March 06, 2022

    2022   ·   model-deployment   gcp   streamlit   ·   MLOps

© Copyright 2026 Shreyansh Singh.