What is the FID?

Published at

2025/03/06

Last edited time

2025/03/06 12:08

Created

2025/03/06 05:16


Series

What is FID?

Background

• IS (Inception Score)

  ◦ The Inception Score has the disadvantage that it does not use the statistics of real-world samples and compare them to the statistics of synthetic samples.

  ◦ In other words, IS evaluates the quality of generated samples based only on their predicted class probabilities, without ever looking at real samples. As a result, it does not directly measure how well the generated data matches the distribution of real images.
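To make the limitation concrete, here is a minimal sketch of the usual IS formula, exp(E_x[KL(p(y|x) || p(y))]). The function name `inception_score` and the `eps` smoothing term are assumptions made for illustration, and the input is assumed to be an array of softmax outputs from a classifier. Note that nothing computed from real images appears anywhere in the calculation.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Sketch of the Inception Score from class probabilities.

    probs: array of shape (n_samples, n_classes), each row a softmax output
           p(y|x) for one GENERATED image. Real images are never used.
    eps:   small constant (an assumption) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over the generated samples
    marginal = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for each generated sample
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # IS is the exponential of the mean KL divergence
    return float(np.exp(kl.mean()))

# Confident and diverse predictions score high; uninformative ones score about 1
confident = np.eye(4)             # 4 samples, each assigned to a different class
uninformative = np.full((4, 4), 0.25)
print(inception_score(confident), inception_score(uninformative))
```

Because the score depends only on the classifier's predictions for generated images, a generator could score well while still producing images that look nothing like the real dataset.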

FID (Fréchet Inception Distance)

https://arxiv.org/pdf/1706.08500

• First introduced in "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium" (Heusel et al., 2017)

• How does FID address the limitation of IS?

  ◦ The Fréchet Inception Distance (FID) improves upon IS by comparing the feature distributions of real and generated images. It assumes that the features extracted from a deep network (e.g., an Inception model) follow a multivariate Gaussian distribution for each set of images:

Compute the mean and covariance of the feature representations for real and generated images

• μ_r, C_r = mean and covariance of the real-image features

• μ_g, C_g = mean and covariance of the generated-image features

Compute the Fréchet distance between these two Gaussian distributions

d^2 \left( (\boldsymbol{\mu}_r, \mathbf{C}_r), (\boldsymbol{\mu}_g, \mathbf{C}_g) \right) = \|\boldsymbol{\mu}_r - \boldsymbol{\mu}_g\|_2^2 + \text{Tr} \left( \mathbf{C}_r + \mathbf{C}_g - 2 (\mathbf{C}_r \mathbf{C}_g)^{1/2} \right).
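Before applying the formula to Inception features, it can help to check it on tiny Gaussians where the answer is known by hand. The sketch below is an illustration only: the function name `frechet_distance` is hypothetical, and instead of a matrix square root it uses the fact that Tr((C_r C_g)^{1/2}) equals the sum of the square roots of the eigenvalues of C_r C_g, which is a swapped-in shortcut that needs only NumPy.

```python
import numpy as np

def frechet_distance(mu_r, C_r, mu_g, C_g):
    """Squared Fréchet distance between N(mu_r, C_r) and N(mu_g, C_g).

    Sketch under the assumption that C_r and C_g are valid covariance
    matrices (symmetric positive semi-definite).
    """
    mu_r = np.atleast_1d(mu_r).astype(float)
    mu_g = np.atleast_1d(mu_g).astype(float)
    C_r = np.atleast_2d(C_r).astype(float)
    C_g = np.atleast_2d(C_g).astype(float)

    diff = mu_r - mu_g
    # Eigenvalues of C_r C_g are real and non-negative for PSD inputs;
    # clip tiny negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(C_r @ C_g).real
    tr_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    return float(diff @ diff + np.trace(C_r) + np.trace(C_g) - 2.0 * tr_sqrt)

I = np.eye(2)
same = frechet_distance([0, 0], I, [0, 0], I)     # identical Gaussians
shifted = frechet_distance([3, 4], I, [0, 0], I)  # mean shift only
wider = frechet_distance([0], [[1]], [0], [[4]])  # variance change only
print(same, shifted, wider)
```

With identical distributions the distance is zero, with equal covariances it reduces to the squared distance between the means (3² + 4² = 25), and in one dimension with equal means it reduces to (σ_r − σ_g)² = (1 − 2)² = 1.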

In Code

import numpy as np
import torch
import torchvision.transforms as transforms
from torchvision.models import inception_v3, Inception_V3_Weights  # feature extractor model
from scipy.linalg import sqrtm
from PIL import Image

# Device setup (use the GPU if CUDA is available, otherwise the CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the InceptionV3 model (used as the feature extractor).
# `weights=` replaces the deprecated `pretrained=True` (torchvision >= 0.13).
feature_Extracter = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1, aux_logits=True).to(device)

def get_inception_features(images, model, device):
    """
    Extract features (feature vectors) from a list of PIL images using the InceptionV3 model.
    """
    model.eval()  # put the model in evaluation mode (inference)
    features_list = []  # list that collects the extracted features

    def hook_fn(module, input, output):
        """
        Forward hook that stores the output of InceptionV3's avgpool layer.
        """
        features_list.append(output.detach().cpu().numpy())  # save the output to the list

    # Hook the avgpool output; keep the handle so the hook can be removed afterwards
    handle = model.avgpool.register_forward_hook(hook_fn)

    # Image preprocessing transforms
    transform = transforms.Compose([
        transforms.Resize((299, 299)),  # resize to the InceptionV3 input size
        transforms.ToTensor(),  # convert to a tensor
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # normalize
    ])

    try:
        with torch.no_grad():  # disable gradient computation (faster inference)
            for img in images:
                img = img.convert('RGB')  # convert to RGB mode
                img = transform(img).unsqueeze(0).to(device)  # add a batch dimension, move to device
                model(img)  # forward pass; the hook captures the avgpool output
    finally:
        handle.remove()  # remove the hook so repeated calls do not stack hooks

    np_features = np.concatenate(features_list, axis=0)  # merge the collected features into one array
    return np_features  # shape: (n, 2048, 1, 1)

def convert_2048(np_value):
    """
    Reshape the feature vectors extracted from InceptionV3 from (n, 2048, 1, 1) to (n, 2048).
    """
    return np_value.reshape(np_value.shape[0], 2048)  # drop the trailing singleton dimensions

def calculate_fid(real_features, fake_features):
    """
    Compute the FID (Fréchet Inception Distance) score.
    """
    real_features = np.array(real_features)  # real-image features as an array
    real_features = convert_2048(real_features)  # reshape to (n, 2048)

    fake_features = np.array(fake_features)  # generated-image features as an array
    fake_features = convert_2048(fake_features)  # reshape to (n, 2048)

    # Mean and covariance of each feature set
    mu1, sigma1 = real_features.mean(axis=0), np.cov(real_features, rowvar=False)
    mu2, sigma2 = fake_features.mean(axis=0), np.cov(fake_features, rowvar=False)

    diff = mu1 - mu2  # difference of the mean vectors
    covmean = sqrtm(sigma1.dot(sigma2))  # matrix square root of the covariance product

    # sqrtm can return a complex result due to numerical error; keep only the real part
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    fid = diff.dot(diff) + np.trace(sigma1 + sigma2 - 2 * covmean)  # apply the FID formula

    return float(fid)  # return the FID score