News|Research

FMGS: Gaussian Splatting with Semantics

Michael Rubloff

Jan 5, 2024

It's no secret at all that I am a massive fan of pairing radiance fields with semantics. I've written about it multiple times and always get excited when I see a new method. This brings me to last night when Foundation Model Embedded Gaussian Splatting (FMGS) came across my screen.

For those not familiar, semantic paired methods literally allow you to ask a radiance field a question about what's contained inside it. It can be a simple question like, where did I leave my olive oil in the kitchen, or what screw do I use first to assemble this desk I got from IKEA?

There was actually another Gaussian Splatting based method just last week, named LangSplat, but this newest method also caught my attention. LangSplat offers a 200X speedup compared to LERF, but FMGS... FMGS is 800X faster than LERF! They're able to achieve 103.4 fps at runtime. Admittedly, the authors have not released much media or many examples that I can show, but I would very much like to.

Returning to LangSplat, it seems to be strong at utilizing SAM, and I am super curious how the two new papers could be layered on top of one another. FMGS stands out for its integration of vision-language embeddings from foundation models directly into the 3D scene representation. LangSplat takes a slightly different approach, constructing a 3D language field by attaching language embeddings distilled from CLIP to each Gaussian and using a tile-based splatting technique to render the language features.

How is FMGS getting such a ridiculous speed boost? It feels like so much of the work people do can be tied back to NVIDIA's Multi-Resolution Hash Encoding from Instant NGP. They're not actually using Instant NGP, because that is NeRF based, but they take direct inspiration from it. In FMGS, the speed boost comes from integrating multi-resolution hash encoding into the framework.

The distinguishing feature of FMGS is its integration of vision-language embeddings from foundation models. These embeddings are incorporated into the 3D scene representation, enabling the model to understand and interpret the semantic content within the scene. In practice, this involves distilling feature maps generated from image-based foundation models and rendering them from the 3D GS model, effectively merging visual and linguistic data.
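
To make the "rendering them from the 3D GS model" part concrete, here is a minimal sketch of how a per-pixel feature could be rendered and compared against a foundation-model feature. It uses the standard front-to-back alpha compositing from Gaussian Splatting, applied to feature vectors instead of colors. The function names, the toy 3-D embeddings, and the choice of cosine distance as the distillation loss are my own assumptions for illustration, not the authors' implementation.

```python
import math

def render_feature(features, alphas):
    """Front-to-back alpha compositing of per-Gaussian feature vectors.

    features: feature vectors of the Gaussians hit by one ray, sorted near to far.
    alphas:   each Gaussian's opacity contribution at this pixel, in [0, 1].
    """
    dim = len(features[0])
    out = [0.0] * dim
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for feat, alpha in zip(features, alphas):
        weight = alpha * transmittance
        for k in range(dim):
            out[k] += weight * feat[k]
        transmittance *= 1.0 - alpha
    return out

def cosine_distance(a, b):
    """1 - cosine similarity; an assumed choice of distillation loss."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb + 1e-8)

# Two Gaussians along one ray, with toy 3-D "embeddings".
rendered = render_feature([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.5, 0.5])
target = [1.0, 0.0, 0.0]  # stand-in for a foundation-model feature at this pixel
loss = cosine_distance(rendered, target)
```

During training, a loss like this would be minimized over many pixels and views so that the rendered features agree with the 2D feature maps.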

While we've seen various efforts aimed at optimizing Gaussian splatting, FMGS introduces a unique solution to the challenge. To navigate the memory and computational constraints often encountered, FMGS leverages a Multi-Resolution Hash Encoding (MHE). This method works in tandem with Gaussian Splatting, enhancing its ability to efficiently represent complex language content within 3D scenes.
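
For readers who have not seen a multi-resolution hash encoding before, the sketch below shows the general idea as introduced by Instant NGP: a 3D point is looked up in several grids of increasing resolution, each grid corner is hashed into a small table of feature vectors, and the interpolated features from all levels are concatenated. The table sizes, level count, and function names are illustrative assumptions, and this is not the FMGS code.

```python
import math
import random

PRIMES = (1, 2654435761, 805459861)  # spatial-hash primes used by Instant NGP

def hash_index(ix, iy, iz, table_size):
    """Hash an integer grid corner into a slot of the feature table."""
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size

def make_tables(num_levels, table_size, feat_dim, seed=0):
    """One table of small feature vectors per level (these are trained in practice)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1e-4, 1e-4) for _ in range(feat_dim)]
             for _ in range(table_size)]
            for _ in range(num_levels)]

def encode(point, tables, base_res=4, growth=2.0):
    """Concatenate trilinearly interpolated features from every level.

    point: (x, y, z) in [0, 1]^3, for example a normalized Gaussian center.
    """
    encoded = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)
        scaled = [c * res for c in point]
        base = [int(math.floor(s)) for s in scaled]
        frac = [s - b for s, b in zip(scaled, base)]
        feat = [0.0] * len(table[0])
        for corner in range(8):  # the 8 corners of the enclosing voxel
            offs = [(corner >> axis) & 1 for axis in range(3)]
            weight = 1.0
            for axis in range(3):
                weight *= frac[axis] if offs[axis] else 1.0 - frac[axis]
            slot = hash_index(base[0] + offs[0], base[1] + offs[1],
                              base[2] + offs[2], len(table))
            for k in range(len(feat)):
                feat[k] += weight * table[slot][k]
        encoded.extend(feat)
    return encoded

tables = make_tables(num_levels=4, table_size=2 ** 10, feat_dim=2)
embedding = encode((0.3, 0.7, 0.1), tables)  # 4 levels x 2 features = 8 values
```

The memory saving comes from the fact that the tables stay the same size no matter how fine the grid gets, so a long language embedding does not have to be stored on every single Gaussian. In a real system a small decoder network would typically map this compact vector to the full embedding dimension; that step is omitted here.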

This component uses hash tables at multiple resolutions, reducing the computational load while maintaining the quality of the semantic embeddings. A key innovation in FMGS is the introduction of a pixel alignment loss. This component ensures that the rendered feature distance of semantically similar entities is minimized, adhering to pixel-level semantic boundaries. This aspect of FMGS contributes to the framework's ability to provide high-quality rendering and fast training, crucial for practical applications.
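
Here is one way a pixel alignment loss could look, as a rough sketch rather than the paper's exact formulation. The assumption is that a second "guide" feature map with crisp object boundaries is available, and the loss asks neighboring pixels in the rendered feature map to be as similar to each other as they are in the guide. The guide map, the neighborhood radius, and the squared-difference form are all assumptions of mine.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)

def pixel_alignment_loss(rendered, guide, radius=1):
    """Mean squared gap between neighbor similarities in two feature maps.

    rendered, guide: H x W grids (nested lists) of feature vectors.
    """
    height, width = len(rendered), len(rendered[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if (dy, dx) == (0, 0) or not inside:
                        continue
                    sim_rendered = cosine(rendered[y][x], rendered[ny][nx])
                    sim_guide = cosine(guide[y][x], guide[ny][nx])
                    total += (sim_rendered - sim_guide) ** 2
                    count += 1
    return total / max(count, 1)

# A 1 x 2 image where the guide says the two pixels are different things.
guide = [[[1.0, 0.0], [0.0, 1.0]]]
blurry = [[[1.0, 0.0], [1.0, 0.0]]]  # rendered features ignore the boundary
sharp = [[[0.0, 1.0], [1.0, 0.0]]]   # rendered features respect the boundary
loss_blurry = pixel_alignment_loss(blurry, guide)
loss_sharp = pixel_alignment_loss(sharp, guide)
```

In this toy case the blurry rendering is penalized and the sharp one is not, which is the behavior the paragraph above describes: features of the same entity are pulled together, and features across a boundary are kept apart.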

FMGS employs a unique training procedure that involves supervising the MHE-based language feature field using a hybrid feature map. This map is derived from multi-scale image crops obtained from various viewpoints. The training process ensures that the language embeddings capture relevant features at each scale, allowing for a comprehensive representation of the scene.
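
A simplified sketch of building such a hybrid feature map for a single training view: the image is cut into crops at several sizes, each crop is embedded, and every pixel receives the average of the embeddings of the crops that cover it, one per scale. The `embed_crop` function is a hypothetical stand-in (it just returns the crop's mean intensity) for a real vision-language image encoder, and the non-overlapping tiling is an assumption made to keep the example short.

```python
def embed_crop(image, top, left, size):
    """Hypothetical encoder stand-in: the crop's mean intensity as a 1-D 'embedding'.
    A real pipeline would run a vision-language model on the crop here."""
    rows = image[top:top + size]
    values = [v for row in rows for v in row[left:left + size]]
    return [sum(values) / len(values)]

def hybrid_feature_map(image, crop_sizes):
    """Average, per pixel, the embeddings of the covering crop at each scale."""
    height, width = len(image), len(image[0])
    dim = len(embed_crop(image, 0, 0, crop_sizes[0]))
    fmap = [[[0.0] * dim for _ in range(width)] for _ in range(height)]
    for size in crop_sizes:
        for top in range(0, height, size):
            for left in range(0, width, size):
                emb = embed_crop(image, top, left, size)
                for y in range(top, min(top + size, height)):
                    for x in range(left, min(left + size, width)):
                        for k in range(dim):
                            fmap[y][x][k] += emb[k] / len(crop_sizes)
    return fmap

# 4 x 4 toy image: dark left half, bright right half.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
fmap = hybrid_feature_map(image, crop_sizes=[2, 4])
```

Each pixel ends up blending local context (the small crop) with global context (the large crop), which is the point of supervising with multiple scales at once.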

For querying, FMGS allows users to interact with the 3D scene using natural language. The model generates relevancy maps based on the query, highlighting semantically relevant parts of the scene.
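
As an illustration of what a relevancy map computation can look like, the sketch below uses the relevancy score introduced by LERF: the query embedding is compared against a set of generic "canonical" phrases with a pairwise softmax, and the lowest score is kept. That FMGS scores relevancy in this style is my assumption here, and the 2-D embeddings are toy values standing in for real text and rendered image embeddings.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def relevancy(pixel_emb, query_emb, canonical_embs):
    """LERF-style score: pairwise softmax of the query against each canonical
    phrase, keeping the worst case. Embeddings are assumed to be unit length."""
    q = math.exp(dot(pixel_emb, query_emb))
    return min(q / (q + math.exp(dot(pixel_emb, c))) for c in canonical_embs)

def relevancy_map(feature_map, query_emb, canonical_embs):
    """Score every pixel of a rendered feature map against the text query."""
    return [[relevancy(f, query_emb, canonical_embs) for f in row]
            for row in feature_map]

query = [1.0, 0.0]                     # toy embedding, e.g. for "olive oil"
canonical = [[0.0, 1.0]]               # toy embedding of a generic phrase
features = [[[1.0, 0.0], [0.0, 1.0]]]  # 1 x 2 rendered feature map
scores = relevancy_map(features, query, canonical)
```

A score above 0.5 means the pixel looks more like the query than like any of the generic phrases, so thresholding the map highlights where the queried thing sits in the scene.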

Unlike traditional methods that focus either on geometric accuracy or semantic understanding, FMGS excels in both. It provides a more holistic understanding of the scene by integrating detailed geometry with rich semantic context. Additionally, FMGS demonstrates a significant improvement in inference speed and versatility compared to other state-of-the-art methods.

FMGS opens up a plethora of possibilities in augmented reality and robotics. In AR, it can enhance user experiences by providing more accurate and interactive representations of physical spaces. In robotics, FMGS can be instrumental in developing robots that understand and navigate spaces more effectively, recognizing objects not just by their shape but also by their semantic properties.

Funnily enough, in order to not go insane out of boredom in the days between Christmas and New Year's, I had a long phone call with a friend, and it clicked for him how many opportunities there are for this. Some of the ones we spoke about were hospital and patient SOP management, evacuation and simulation methods, and a grocery store automating inventory. Not far from that, some of my personal favorites are in the agricultural space.

Given that FMGS comes out of Google, I have to imagine how they might be thinking about it benefiting search. I would be curious to see how a user of Google Maps might use FMGS. My thoughts go to more everyday uses, such as asking, where is the bathroom in this coffee shop?

Think about all the possibilities of what you can do with radiance fields paired with semantics. What do you think? How would you use a radiance field that can highlight what's contained in it?

The authors have also stated that they will release their code after the paper has been accepted.
