ImageSynchronizer

Introduction

ImageSynchronizer is an image data management package.

To display pandas DataFrames cleanly, we strongly recommend running the examples below in Jupyter Notebook.

Installation

  1. Install the Microsoft ODBC driver
    Linux
    Windows

  2. Download the package from this link

  3. Install from the command line :

    Windows : python setup.py install

    Linux : sudo python3 setup.py install

How ImageSynchronizer works


What is the naming convention for datasets?

Format :

A_B_C_D


  • A : Dataset source

    • Original : image data captured in-house (original dataset)
    • Opensource : open-source image data downloaded from the internet (open-source dataset)

  • B : Capture date for Original datasets, or dataset name for Opensource datasets

    • Original dataset : capture date
    • Open-source dataset : name of the dataset

  • C : Image type

    • One of the following four types :
    • rgb, dvs, fisheye, dewarp

  • D : Label type

    1. bbox : Normal bounding box
    2. seg : Semantic segmentation
    3. clf : Classification
    4. rbbox : Rotated bounding box (bounding box with angle)
    5. poly : Polygon
    6. reg : Regression

※ Each label type has a fixed JSON file format.
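
The naming rule above can be checked mechanically. The following is a minimal sketch, not part of ImageSynchronizer: `parse_dataset_id` and `DatasetName` are hypothetical names, and the handling of `&` (several label types joined in one field, as in `clf&reg`) is an assumption inferred from the dataset IDs listed later in this document.

```python
from typing import NamedTuple, Tuple

# Allowed values, taken from the naming rules above.
SOURCES = {"original", "opensource"}
IMAGE_TYPES = {"rgb", "dvs", "fisheye", "dewarp"}
LABEL_TYPES = {"bbox", "seg", "clf", "rbbox", "poly", "reg"}


class DatasetName(NamedTuple):
    source: str                    # A: original / opensource
    date_or_name: str              # B: capture date or open-source dataset name
    image_type: str                # C: rgb, dvs, fisheye, dewarp
    label_types: Tuple[str, ...]   # D: one or more label types


def parse_dataset_id(dataset_id: str) -> DatasetName:
    """Hypothetical helper: split an A_B_C_D dataset ID into its four fields."""
    parts = dataset_id.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields separated by '_', got {len(parts)}")
    source, date_or_name, image_type, label = parts
    source = source.lower()
    # Assumption: '&' joins several label types in one field, e.g. 'clf&reg'.
    label_types = tuple(label.split("&"))
    if source not in SOURCES:
        raise ValueError(f"unknown dataset source: {source}")
    if image_type not in IMAGE_TYPES:
        raise ValueError(f"unknown image type: {image_type}")
    unknown = [t for t in label_types if t not in LABEL_TYPES]
    if unknown:
        raise ValueError(f"unknown label type(s): {unknown}")
    return DatasetName(source, date_or_name, image_type, label_types)


print(parse_dataset_id("original_20200207_fisheye_rbbox"))
```

A validator like this would catch a misspelled image type or a missing field before a dataset is registered.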


Usage

1. Connect to the SQL database and the image storage

In [1]:
from ImageSynchronizer import ImageSynchronizer
sync = ImageSynchronizer.synchronizer()
Connecting to Azure database....
Connecting to data storage...

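2. Dataset contents

After connecting to the SQL database, use the datasetsInfo attribute to list all datasets.

  • datasetID : name of the dataset
  • label_version : version of the labeling scheme. For example, version 1.0.0 of the bounding-box labels contained only the 'person' class; after discussion the labels were split into two classes, 'person with face' and 'person without face', and the label version was bumped to 2.0.0.
  • image_type : the type of image data the dataset contains
In [2]:
# List information about all datasets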
sync.datasetsInfo
Out[2]:
    datasetID                                  label_version  image_type  classes         projects     label_type  size  note
 0  opensource_habbof_rgb_rbbox                2.0.0          fisheye360  person          wework       rbbox       5837
 1  opensource_handraised_rgb_bbox             1.0.0          rgb         person          sbr          bbox        3968  people detection dataset with hand raised person
 2  opensource_imdbwiki&affectnet_rgb_clf&reg  1.0.0          rgb         gender,emotion  sbr,bestbuy  clf,reg        0  age, gender, emotion dataset
 3  opensource_maskedface_rgb_bbox             1.0.0          rgb         face            sbr          bbox        1526  face detction dataset with masked faces
 4  opensource_widerface_rgb_bbox              1.0.0          rgb         face            sbr          bbox        8036  opensource face detection dataset - wider face
 5  original_20200114_rgb_bbox                 1.0.0          rgb         face            sbr          bbox         718
 6  original_20200120_rgb_bbox                 1.0.0          rgb         face            sbr          bbox        6462
 7  original_20200121_rgb_bbox                 1.0.0          rgb         face            sbr          bbox        6434
 8  original_20200207_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox        161  old ulsee 3F office - working area, known as W...
 9  original_20200219_dewarp_bbox              2.0.0          dewarp      person          sbr          bbox        8125
10  original_20200219_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox         62  (Under inspection)know as Ceiling_OwnRecord_Pa...
11  original_20200219_rgb_bbox                 1.0.0          rgb         face            sbr          bbox         279
12  original_20200227_dewarp_bbox              2.0.0          dewarp      person          sbr          bbox        5502
13  original_20200227_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox        325  (Under inspection)know as Ceiling_OwnRecord_Pa...
14  original_20200227_rgb_bbox                 1.0.0          rgb         face            sbr          bbox        4345
15  original_20200316_dewarp_bbox              2.0.0          dewarp      person          sbr          bbox        3129  know as Pedstrian_OwnRecord_20200316
16  original_20200316_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox        333  know as Ceiling_OwnRecord_Party3
17  original_20200324_rgb_bbox                 1.0.0          rgb         face            sbr          bbox         298
18  original_20200325_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox        210  know as Ceiling_OwnRecord_Party4
19  original_20200327_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox        835  ulsee 9F office - relaxing area
20  original_20200406_dewarp_bbox              2.0.0          dewarp      person          sbr          bbox        1224  know as FromSBR_20200406_dewarped_Labeledv2
21  original_20200414_fisheye_rbbox            2.0.0          fisheye360  person          wework       rbbox       1324  ulsee 9F office - working area
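
When selecting datasets by label version, comparing the version strings directly can mislead, because "10.0.0" sorts before "9.0.0" as text. The sketch below shows one way to compare versions numerically. It uses plain dictionaries with a few rows copied from the table above instead of the real DataFrame, and `parse_version` is a hypothetical helper, not part of ImageSynchronizer.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.0.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


# A few rows copied from the datasetsInfo output above.
rows = [
    {"datasetID": "original_20200114_rgb_bbox", "label_version": "1.0.0"},
    {"datasetID": "original_20200207_fisheye_rbbox", "label_version": "2.0.0"},
    {"datasetID": "original_20200219_dewarp_bbox", "label_version": "2.0.0"},
]

# Keep only datasets labeled with scheme 2.0.0 or newer.
v2_or_newer = [
    row["datasetID"]
    for row in rows
    if parse_version(row["label_version"]) >= (2, 0, 0)
]
print(v2_or_newer)
```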

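3. Subscribe to specific datasets

In [3]:
# Define the subscription criteria and get the matching dataset IDs
datasets = sync.datasetsInfo[sync.datasetsInfo['image_type']=='fisheye360']['datasetID']
# Subscribe to Original datasets only
datasets = [i for i in datasets if 'opensource' not in i]
# Send the subscription request
sync.subscribe(datasets=datasets)
In [4]:
# Show information about the datasets you have subscribed to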
sync.datasetsInfo_subscribed
Out[4]:
    datasetID                        label_version  image_type  classes  projects  label_type  size  note
 8  original_20200207_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox        161  old ulsee 3F office - working area, known as W...
10  original_20200219_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox         62  (Under inspection)know as Ceiling_OwnRecord_Pa...
13  original_20200227_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox        325  (Under inspection)know as Ceiling_OwnRecord_Pa...
16  original_20200316_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox        333  know as Ceiling_OwnRecord_Party3
18  original_20200325_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox        210  know as Ceiling_OwnRecord_Party4
19  original_20200327_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox        835  ulsee 9F office - relaxing area
21  original_20200414_fisheye_rbbox  2.0.0          fisheye360  person   wework    rbbox       1324  ulsee 9F office - working area

4. Sync dataset to target folder

In [5]:
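# After sending the subscription request, sync the subscribed datasets to a target folder
sync.update_subscription(target_path = "./test")
# This step may take several hours, depending on the size of the subscribed datasets
# Note : this function only syncs images that are missing from the target folder; images already there are not downloaded again

# If the datasets are very large, you can also limit the number of images to sync to speed things up
sync.update_subscription(target_path = "./test", amount = 50)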
Syncing files : original_20200414_fisheye_rbbox - 2648/2648

Synchronization complete!
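
The "only sync what is missing" behavior described in the notes above can be pictured with a short sketch. This is not the package's implementation: `files_to_sync` is a hypothetical helper, the remote listing is assumed to be a list of relative paths, and it is an assumption that `amount` simply caps how many missing files are fetched.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def files_to_sync(remote_paths: Iterable[str], target_path: str,
                  amount: Optional[int] = None) -> List[str]:
    """Return the remote relative paths that are missing under target_path."""
    target = Path(target_path)
    existing = set()
    if target.is_dir():
        # Relative POSIX-style paths of every file already in the target folder.
        existing = {p.relative_to(target).as_posix()
                    for p in target.rglob("*") if p.is_file()}
    missing = [path for path in remote_paths if path not in existing]
    # Assumption: `amount` caps the number of missing files to fetch.
    return missing if amount is None else missing[:amount]
```

Under these assumptions, running the sync a second time with the same target folder would return an empty list, which is why repeated calls are cheap once the images are on disk.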

5. Preview images and bounding boxes with the custom draw_box function

In [6]:
from ImageSynchronizer.utils import draw_box
import glob
import numpy as np
import matplotlib.pyplot as plt

anno_list = glob.glob(r"test/**/*.json")
rdn_anno = np.random.choice(anno_list,1)[0]
img = draw_box(rdn_anno)
plt.figure(figsize = (10,10))
plt.imshow(img[...,::-1])
Out[6]:
<matplotlib.image.AxesImage at 0x20fa7a31e88>

Annotation conversion

ImageSynchronizer defines its own annotation format for each label type and stores it as a JSON file.

Users can convert ImageSynchronizer JSON annotations to other annotation formats :

  • labelme JSON format :

    The original ImageSynchronizer annotation JSON files are moved into the folder ImageSynchronizer_annotation, and the newly generated labelme annotations replace them at the original location.

    • rbbox labels converted to the labelme_json format are saved as polygon annotations.
    • bbox labels converted to the labelme_json format are saved as rectangle annotations.
  • roLabelImg XML format :

    The converted XML annotation file is generated in the same folder as the original ImageSynchronizer annotation JSON.
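
Saving an rbbox as a polygon means turning one rotated box into four corner points. The sketch below illustrates that geometry only; it is not the package's converter. `rbbox_to_polygon` is a hypothetical helper, and it assumes that (x, y) is the box center, w and h are the side lengths, and a is the rotation angle in radians, matching the x, y, w, h, a columns of objects_df.

```python
import math
from typing import List, Tuple


def rbbox_to_polygon(x: float, y: float, w: float, h: float,
                     a: float) -> List[Tuple[float, float]]:
    """Return the four corners of a rotated box as (x, y) points.

    Assumptions: (x, y) is the box center and `a` is in radians.
    """
    cos_a, sin_a = math.cos(a), math.sin(a)
    half_w, half_h = w / 2.0, h / 2.0
    # Corner offsets of the unrotated box, relative to its center.
    offsets = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]
    # Rotate each offset by `a`, then translate it to the box center.
    return [(x + dx * cos_a - dy * sin_a, y + dx * sin_a + dy * cos_a)
            for dx, dy in offsets]


print(rbbox_to_polygon(0, 0, 2, 1, 0))
```

With a = 0 the result is the ordinary axis-aligned rectangle, which gives a quick sanity check on the corner order.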

In [7]:
# Convert an ImageSynchronizer_json annotation to the rolabelimg_xml format
# (this cell converts one randomly chosen file; loop over anno_list to convert in batch)
from ImageSynchronizer.parse_annotation import parse_annotation
from IPython.display import display
import glob
import numpy as np

anno_dir = "./test/original_20200207_fisheye_rbbox/*.json"
anno_list = glob.glob(anno_dir)
rdn_anno = np.random.choice(anno_list,1)[0]
parser = parse_annotation(rdn_anno)
display(parser.objects_df) # Show all bounding boxes as a DataFrame
parser.convert('rolabelimg_xml')
class x y w h a
0 person 325.0606 1830.0276 234.9766 124.9476 2.805079
1 person 2027.1913 2194.8101 236.2877 180.1126 0.909657
2 person 1920.7086 2347.9131 190.8622 129.6960 1.083855
3 person 1818.9238 2472.0998 182.9408 152.0798 1.218936
4 person 1601.1981 2391.9068 230.9878 154.6849 1.403045
5 person 496.2069 2016.4414 227.9669 104.8132 2.593291
6 person 531.5791 1802.1040 166.5165 172.8376 2.762287
7 person 86.3188 1310.6348 40.6348 27.5578 0.095276
8 person 852.6728 457.3658 121.3943 135.1193 1.032065
9 person 1384.6671 279.3909 285.8087 98.9037 1.523157
10 person 2427.6921 674.4129 70.2182 95.8702 2.482203
11 person 2472.9676 1468.6820 161.8800 151.8804 0.027760
12 person 2574.4495 1335.5937 141.7466 103.9082 3.049819
13 person 2497.8355 925.7681 120.4263 136.4198 2.689113
14 person 56.5960 1206.2769 37.6174 27.2539 0.167367
15 person 83.6471 1181.4133 53.0507 33.6599 0.188388
16 person 78.3192 1144.6316 43.8087 29.5236 0.213605
17 person 85.7717 1105.4173 64.5504 47.9934 0.242215
18 person 120.6037 1041.8090 69.5771 48.0813 0.293105
19 person 171.0714 930.0786 46.9209 43.0253 0.382102
20 person 198.1711 879.6247 82.3433 54.2486 0.423893
21 person 407.9223 694.9866 68.2060 69.3103 0.625246
22 person 125.7133 1292.4381 76.5121 37.9849 0.111807
Convert original_20200207_fisheye_rbbox_00079.json complete!