A Guide to Unsupervised Image Segmentation with Normalized Cuts (NCut) in Python

php中文网 2024-10-15 11:21:52

Introduction

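Image segmentation plays a crucial role in understanding and analyzing visual data, and Normalized Cuts (NCut) is a widely used graph-based segmentation method. In this article, we explore how to apply NCut for unsupervised image segmentation in Python using a dataset from Microsoft Research, with a focus on using superpixels to improve segmentation quality.

Dataset Overview

The dataset used for this task can be downloaded from the following link: MSRC Object Category Image Database. It contains the original images together with their semantic segmentation into nine object classes (stored in image files whose names end in "_gt"). The images are grouped into thematic subsets, and the first number in each file name identifies the class subset. This makes the dataset well suited to experimenting with segmentation tasks.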
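The naming convention described above can be used to pair each image with its ground truth. The following is a minimal sketch, assuming the archive was extracted into a single flat folder of .bmp files. The helper name pair_images_with_ground_truth and the case-insensitive match on the "_gt" suffix are illustrative assumptions, not something the dataset documents.

```python
from pathlib import Path


def pair_images_with_ground_truth(folder):
    """Return {subset_number: [(image_path, gt_path), ...]} for a flat folder.

    Hypothetical helper: assumes names like '1_16_s.bmp' with a matching
    '1_16_s_gt.bmp', where the first number is the class subset.
    """
    folder = Path(folder)
    # index files by lowercase name so that '_gt' and '_GT' both match
    by_name = {p.name.lower(): p for p in folder.iterdir() if p.is_file()}
    pairs = {}
    for name, path in sorted(by_name.items()):
        stem, suffix = Path(name).stem, Path(name).suffix
        if suffix != ".bmp" or stem.endswith("_gt"):
            continue  # skip non-images and the ground-truth files themselves
        gt_path = by_name.get(f"{stem}_gt{suffix}")
        first = stem.split("_")[0]
        if gt_path is None or not first.isdigit():
            continue  # no ground truth, or the name does not follow the convention
        pairs.setdefault(int(first), []).append((path, gt_path))
    return pairs
```

Calling pair_images_with_ground_truth on the extracted folder would give a dictionary keyed by subset number, which makes it easy to run the rest of the pipeline on one thematic subset at a time.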

Problem Statement

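We use the NCut algorithm to segment images from the dataset. Pixel-level segmentation is computationally expensive and often noisy. To overcome this, we use SLIC (Simple Linear Iterative Clustering) to generate superpixels, which group similar pixels and reduce the size of the problem. To evaluate segmentation accuracy, different metrics can be used (for example, Intersection over Union, SSIM, or the Rand index).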
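For reference, the criterion that NCut minimizes when it splits a graph with node set V into two parts A and B can be written as follows. This is the standard normalized-cut formulation, where w(u, v) is the weight of the edge between nodes u and v; in this article the nodes would be the superpixels.

```latex
\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)},
\qquad
\mathrm{cut}(A, B) = \sum_{u \in A,\, v \in B} w(u, v),
\qquad
\mathrm{assoc}(A, V) = \sum_{u \in A,\, t \in V} w(u, t)
```

Normalizing the cut by the total connection of each part to the whole graph discourages splitting off very small, isolated groups of nodes.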

Implementation

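1. Install the required libraries

We use skimage for image processing, numpy for numerical computation, and matplotlib for visualization. The evaluation in step 5 also imports jaccard_score from scikit-learn.

pip install numpy matplotlib scikit-learn
pip install scikit-image==0.24.0

2. Load and preprocess the dataset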

First download and extract the dataset; we then load an image and its ground-truth segmentation:

wget http://download.microsoft.com/download/a/1/1/a116cd80-5b79-407e-b5ce-3d5c6ed8b0d5/msrc_objcategimagedatabase_v1.zip -O msrc_objcategimagedatabase_v1.zip
unzip msrc_objcategimagedatabase_v1.zip
rm msrc_objcategimagedatabase_v1.zip

Now we are ready to start coding.

from skimage import io, segmentation, color, measure
from skimage import graph
import numpy as np
import matplotlib.pyplot as plt

# load the image and its ground truth
image = io.imread('/content/msrc_objcategimagedatabase_v1/1_16_s.bmp')
ground_truth = io.imread('/content/msrc_objcategimagedatabase_v1/1_16_s_gt.bmp')

# show images side by side
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(image)
ax[0].set_title('image')
ax[1].imshow(ground_truth)
ax[1].set_title('ground truth')
plt.show()

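3. Generate superpixels with SLIC and build a region adjacency graph

Before applying NCut, we compute superpixels with the SLIC algorithm. Using the resulting superpixels, we build a region adjacency graph (RAG) based on mean color similarity: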

from skimage.util import img_as_ubyte

compactness = 30
n_segments = 100
labels = segmentation.slic(image, compactness=compactness, n_segments=n_segments, enforce_connectivity=True)
# draw the superpixel boundaries in black on top of the image
image_with_boundaries = segmentation.mark_boundaries(image, labels, color=(0, 0, 0))
image_with_boundaries = img_as_ubyte(image_with_boundaries)
# paint each superpixel with its average color
pixel_labels = color.label2rgb(labels, image_with_boundaries, kind='avg', bg_label=0)

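compactness controls the balance between color similarity and spatial proximity when pixels are grouped into superpixels. It determines how much weight is given to keeping superpixels compact (spatially tight) versus keeping them homogeneous in color.
Higher values: a higher compactness makes the algorithm favor spatially compact, uniformly sized superpixels and pay less attention to color similarity. The resulting superpixels may be less sensitive to edges and color gradients.
Lower values: a lower compactness lets superpixels vary more in shape and size so that they follow color differences more closely. This usually yields superpixels that adhere more tightly to object boundaries in the image.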
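How compactness trades color against space can be seen in the combined distance that SLIC, as described in the original algorithm, uses to assign a pixel to a cluster center. The sketch below is illustrative only: the function name slic_distance and the sample numbers are hypothetical, and the internal implementation in skimage may scale the terms differently.

```python
import math


def slic_distance(color_dist, spatial_dist, compactness, grid_interval):
    """Combined SLIC distance D = sqrt(d_c^2 + (d_s / S)^2 * m^2).

    color_dist    -- d_c, color distance between pixel and cluster center
    spatial_dist  -- d_s, spatial distance in pixels
    compactness   -- m, the weight of the spatial term
    grid_interval -- S, approximate spacing between cluster centers
    """
    spatial_term = (spatial_dist / grid_interval) * compactness
    return math.sqrt(color_dist ** 2 + spatial_term ** 2)


# the same pixel seen from two candidate centers:
# center A is similar in color but far away, center B is close but differs in color
for m in (1, 30):
    d_a = slic_distance(color_dist=5, spatial_dist=20, compactness=m, grid_interval=20)
    d_b = slic_distance(color_dist=20, spatial_dist=5, compactness=m, grid_interval=20)
    winner = "A (color-similar)" if d_a < d_b else "B (spatially close)"
    print(f"compactness={m}: D_A={d_a:.2f}, D_B={d_b:.2f} -> assigned to {winner}")
```

With the low compactness the color-similar center wins, while with compactness=30 the spatially closer center wins, which matches the behavior described above.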

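n_segments controls the number of superpixels (segments) that the SLIC algorithm tries to produce in the image. In effect, it sets the resolution of the segmentation.
Higher values: a higher n_segments creates more superpixels, so each superpixel is smaller and the segmentation is finer-grained. This is useful when the image contains complex textures or small objects.
Lower values: a lower n_segments produces fewer, larger superpixels. This is useful when you want a coarse segmentation that groups large areas into single superpixels.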
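As a rough rule of thumb, if an image with N pixels is divided into k roughly equal superpixels, each one covers about N / k pixels and the centers are spaced about sqrt(N / k) pixels apart. The sketch below uses a hypothetical 320 x 213 image purely for illustration, and the helper name approx_superpixel_size is an assumption. The number of segments SLIC actually returns can differ from n_segments.

```python
import math


def approx_superpixel_size(height, width, n_segments):
    """Return (pixels per superpixel, approximate grid spacing in pixels)."""
    area = height * width / n_segments
    return area, math.sqrt(area)


# hypothetical image size, chosen only for illustration
for k in (50, 100, 400):
    area, spacing = approx_superpixel_size(213, 320, k)
    print(f"n_segments={k}: ~{area:.0f} px per superpixel, ~{spacing:.1f} px spacing")
```

A quick estimate like this helps when choosing n_segments: objects much smaller than the grid spacing are likely to be absorbed into a neighboring superpixel.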

4. Apply Normalized Cut (NCut) and visualize the results

# using the labels found with the superpixeled image
# compute the region adjacency graph using mean colors
g = graph.rag_mean_color(image, labels, mode='similarity')

# perform normalized graph cut on the region adjacency graph
labels2 = graph.cut_normalized(labels, g)
segmented_image = color.label2rgb(labels2, image, kind='avg')
f, axarr = plt.subplots(nrows=1, ncols=4, figsize=(25, 20))

axarr[0].imshow(image)
axarr[0].set_title("original")

#plot boundaries
axarr[1].imshow(image_with_boundaries)
axarr[1].set_title("superpixels boundaries")

#plot labels
axarr[2].imshow(pixel_labels)
axarr[2].set_title('superpixel labels')

# plot the NCut segmentation
axarr[3].imshow(segmented_image)
axarr[3].set_title('segmented image (normalized cut)')
plt.show()
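To make the RAG construction and the cut less of a black box, here is a small NumPy sketch of the idea on a toy graph. It assumes, following the scikit-image documentation for rag_mean_color with mode='similarity', that the weight between two adjacent regions is exp(-d^2 / sigma), where d is the Euclidean distance between their mean colors and sigma defaults to 255.0. The bipartition uses the second-smallest eigenvector of the normalized Laplacian, which is the standard relaxation of the NCut problem. The mean colors, the adjacency list, and the sign-based split are illustrative assumptions; graph.cut_normalized applies its own threshold search and recursion.

```python
import numpy as np

# toy example: mean colors of 6 superpixels forming two slightly different groups
mean_colors = np.array([
    [100, 100, 100], [103, 101, 99], [98, 102, 101],    # darker group
    [120, 116, 110], [122, 118, 112], [119, 115, 113],  # brighter group
], dtype=float)
# hypothetical adjacency: two triangles joined by the single edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

sigma = 255.0
n = len(mean_colors)
W = np.zeros((n, n))
for u, v in edges:
    d = np.linalg.norm(mean_colors[u] - mean_colors[v])
    W[u, v] = W[v, u] = np.exp(-d ** 2 / sigma)  # similar colors -> weight near 1

# normalized Laplacian D^{-1/2} (D - W) D^{-1/2}
degree = W.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(degree)
L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

# eigh returns eigenvalues in ascending order; column 1 belongs to the second smallest
eigenvalues, eigenvectors = np.linalg.eigh(L_sym)
second_vector = d_inv_sqrt * eigenvectors[:, 1]

# split the nodes by the sign of the eigenvector entries
partition = (second_vector > 0).astype(int)
print("edge weights:\n", np.round(W, 3))
print("partition:", partition)
```

The weak edge between the two groups is the one that gets cut, so the first three superpixels end up in one part and the last three in the other. Which part is labeled 0 or 1 depends on the arbitrary sign of the eigenvector.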

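5. Evaluation metrics

A key challenge in unsupervised segmentation is that NCut does not know the exact number of classes in the image. The number of segments NCut finds may exceed the actual number of ground-truth regions, so we need robust metrics to assess segmentation quality.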

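Intersection over Union (IoU) is a widely used metric for evaluating segmentation tasks, particularly in computer vision. It measures the overlap between a predicted region and the corresponding ground-truth region: the area of their intersection divided by the area of their union.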
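For a single pair of regions the definition can be sketched in a few lines of NumPy. The function name iou and the convention of returning 1.0 for two empty masks are choices made for this illustration.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """IoU of two boolean masks: |A intersect B| / |A union B|."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred_mask, true_mask).sum()
    if union == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    intersection = np.logical_and(pred_mask, true_mask).sum()
    return intersection / union


# two 4x4 masks whose 2x2 squares share one column
pred = np.zeros((4, 4), dtype=bool)
true = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:2] = True   # 4 pixels
true[0:2, 1:3] = True   # 4 pixels, 2 of them shared with pred
print(iou(pred, true))  # intersection 2, union 6
```

For a multi-region segmentation, this value is computed per region and then averaged, which is what the mean IoU reported later in this section does.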

The Structural Similarity Index (SSIM) assesses the perceived quality of an image by comparing the luminance, contrast, and structure of two images.
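For two image windows x and y, the commonly used form of SSIM combines these three comparisons into a single expression. Here \mu denotes the window mean, \sigma^2 the variance, \sigma_{xy} the covariance, and C_1, C_2 are small constants that stabilize the division. The per-window values are then averaged over the image.

```latex
\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}
```

The value is 1 when the two windows are identical and decreases as they diverge in brightness, contrast, or structure.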

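To apply these metrics, the prediction and the ground-truth image must use the same kind of labels. To compute the labels, we build a mask for both the ground truth and the prediction by assigning an ID to each distinct color found in the image. Note that IDs are assigned in order of first appearance, so a given ID in the prediction does not necessarily denote the same region in the ground truth.
In addition, segmentation with NCut may find more regions than the ground truth contains, which lowers the accuracy.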

def compute_mask(image):
    """Assign an integer id to every distinct color in an RGB image."""
    color_dict = {}

    # get the shape of the image
    height, width, _ = image.shape

    # create an empty array for labels
    labels = np.zeros((height, width), dtype=int)
    next_id = 0
    # loop over each pixel
    for i in range(height):
        for j in range(width):
            # get the color of the pixel
            pixel_color = tuple(image[i, j])
            # create a new id the first time a color is seen
            if pixel_color not in color_dict:
                color_dict[pixel_color] = next_id
                next_id += 1
            # assign the label from the dictionary
            labels[i, j] = color_dict[pixel_color]

    return labels


def show_img(prediction, groundtruth):
    f, axarr = plt.subplots(nrows=1, ncols=2, figsize=(15, 10))

    axarr[0].imshow(groundtruth)
    axarr[0].set_title("groundtruth")
    axarr[1].imshow(prediction)
    axarr[1].set_title("prediction")
    plt.show()


prediction_mask = compute_mask(segmented_image)
groundtruth_mask = compute_mask(ground_truth)

# using the original image as baseline to convert from labels to color
# mask ids start at 0, so use bg_label=-1 to avoid painting region 0 black
prediction_img = color.label2rgb(prediction_mask, image, kind='avg', bg_label=-1)
groundtruth_img = color.label2rgb(groundtruth_mask, image, kind='avg', bg_label=-1)

show_img(prediction_img, groundtruth_img)
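The per-pixel Python loop in compute_mask can be slow on larger images. A possible vectorized alternative, sketched below, uses np.unique over the flattened colors. The function name compute_mask_vectorized is hypothetical. Note one difference: ids follow the sorted order of the colors rather than the order of first appearance, so the numbering is not identical to compute_mask, although the regions are the same.

```python
import numpy as np


def compute_mask_vectorized(image):
    """Map every distinct color of an (H, W, C) image to an integer id."""
    height, width, channels = image.shape
    flat = image.reshape(-1, channels)
    # unique rows = distinct colors; inverse gives the color index of each pixel
    _, inverse = np.unique(flat, axis=0, return_inverse=True)
    # reshape copes with both the flat and the 2-D inverse returned by some NumPy versions
    return inverse.reshape(height, width)


demo = np.array([
    [[255, 0, 0], [255, 0, 0], [0, 0, 255]],
    [[0, 255, 0], [0, 0, 255], [0, 0, 255]],
], dtype=np.uint8)
mask = compute_mask_vectorized(demo)
print(mask)
```

In the demo, blue sorts first and red last, so the red pixels receive id 2, the green pixel id 1, and the blue pixels id 0.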

Now we compute the accuracy scores:

from sklearn.metrics import jaccard_score
from skimage.metrics import structural_similarity as ssim

ssim_score = ssim(prediction_img, groundtruth_img, channel_axis=2)
print(f"SSIM SCORE: {ssim_score}")

jac = jaccard_score(y_true=np.asarray(groundtruth_mask).flatten(),
                        y_pred=np.asarray(prediction_mask).flatten(),
                        average = None)

# compute mean IoU score across all classes
mean_iou = np.mean(jac)
print(f"Mean IoU: {mean_iou}")
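One caveat with the computation above is that jaccard_score compares label ids directly, while the ids produced by compute_mask for the prediction and for the ground truth are not guaranteed to refer to the same regions. A possible alternative, sketched here as an assumption rather than as the method used in this article, is to score each ground-truth region against the predicted region that overlaps it best. The function name best_match_mean_iou is hypothetical.

```python
import numpy as np


def best_match_mean_iou(true_labels, pred_labels):
    """Mean over ground-truth regions of the best IoU with any predicted region."""
    true_ids, true_idx = np.unique(true_labels, return_inverse=True)
    pred_ids, pred_idx = np.unique(pred_labels, return_inverse=True)
    true_idx = true_idx.ravel()
    pred_idx = pred_idx.ravel()

    # contingency table: overlap[i, j] = pixels in true region i and predicted region j
    overlap = np.zeros((len(true_ids), len(pred_ids)), dtype=np.int64)
    np.add.at(overlap, (true_idx, pred_idx), 1)

    true_area = overlap.sum(axis=1, keepdims=True)
    pred_area = overlap.sum(axis=0, keepdims=True)
    union = true_area + pred_area - overlap
    iou_matrix = overlap / union  # union > 0 because every region has at least one pixel
    return iou_matrix.max(axis=1).mean()


truth = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1]])
# same left region under a different id; the right half is over-segmented
pred = np.array([[7, 7, 3, 5],
                 [7, 7, 3, 5]])
print(best_match_mean_iou(truth, pred))
```

Here the left region is matched perfectly despite the different ids, while the over-segmented right region is penalized. If a strict one-to-one matching between regions is preferred, scipy.optimize.linear_sum_assignment could be applied to the same IoU matrix instead of taking the row-wise maximum.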