Import Your Dataset to Waffle Dataset
You may already have a dataset in a common format such as COCO, YOLO, or Hugging Face (Transformers). You can import it into a Waffle Dataset with the matching from_{format} function, for example from_coco or from_transformers.
A sample dataset is provided for this tutorial. Follow the steps below.
COCO Format
In [1]:
!wget https://github.com/snuailab/assets/raw/main/waffle/sample_dataset/mnist.zip
!unzip mnist.zip -d coco
--2023-06-26 13:23:48-- https://github.com/snuailab/assets/raw/main/waffle/sample_dataset/mnist.zip Resolving github.com (github.com)... 20.200.245.247 Connecting to github.com (github.com)|20.200.245.247|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://raw.githubusercontent.com/snuailab/assets/main/waffle/sample_dataset/mnist.zip [following] --2023-06-26 13:23:48-- https://raw.githubusercontent.com/snuailab/assets/main/waffle/sample_dataset/mnist.zip Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.108.133, 185.199.109.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 157823 (154K) [application/zip] Saving to: ‘mnist.zip’ mnist.zip 100%[===================>] 154.12K --.-KB/s in 0.02s 2023-06-26 13:23:49 (8.20 MB/s) - ‘mnist.zip’ saved [157823/157823] Archive: mnist.zip inflating: coco/coco.json creating: coco/images/ inflating: coco/images/1.png inflating: coco/images/10.png inflating: coco/images/100.png inflating: coco/images/11.png inflating: coco/images/12.png inflating: coco/images/13.png inflating: coco/images/14.png inflating: coco/images/15.png inflating: coco/images/16.png inflating: coco/images/17.png inflating: coco/images/18.png inflating: coco/images/19.png inflating: coco/images/2.png inflating: coco/images/20.png inflating: coco/images/21.png inflating: coco/images/22.png inflating: coco/images/23.png inflating: coco/images/24.png inflating: coco/images/25.png inflating: coco/images/26.png inflating: coco/images/27.png inflating: coco/images/28.png inflating: coco/images/29.png inflating: coco/images/3.png inflating: coco/images/30.png inflating: coco/images/31.png inflating: coco/images/32.png inflating: coco/images/33.png inflating: coco/images/34.png inflating: coco/images/35.png inflating: coco/images/36.png inflating: coco/images/37.png inflating: 
coco/images/38.png inflating: coco/images/39.png inflating: coco/images/4.png inflating: coco/images/40.png inflating: coco/images/41.png inflating: coco/images/42.png inflating: coco/images/43.png inflating: coco/images/44.png inflating: coco/images/45.png inflating: coco/images/46.png inflating: coco/images/47.png inflating: coco/images/48.png inflating: coco/images/49.png inflating: coco/images/5.png inflating: coco/images/50.png inflating: coco/images/51.png inflating: coco/images/52.png inflating: coco/images/53.png inflating: coco/images/54.png inflating: coco/images/55.png inflating: coco/images/56.png inflating: coco/images/57.png inflating: coco/images/58.png inflating: coco/images/59.png inflating: coco/images/6.png inflating: coco/images/60.png inflating: coco/images/61.png inflating: coco/images/62.png inflating: coco/images/63.png inflating: coco/images/64.png inflating: coco/images/65.png inflating: coco/images/66.png inflating: coco/images/67.png inflating: coco/images/68.png inflating: coco/images/69.png inflating: coco/images/7.png inflating: coco/images/70.png inflating: coco/images/71.png inflating: coco/images/72.png inflating: coco/images/73.png inflating: coco/images/74.png inflating: coco/images/75.png inflating: coco/images/76.png inflating: coco/images/77.png inflating: coco/images/78.png inflating: coco/images/79.png inflating: coco/images/8.png inflating: coco/images/80.png inflating: coco/images/81.png inflating: coco/images/82.png inflating: coco/images/83.png inflating: coco/images/84.png inflating: coco/images/85.png inflating: coco/images/86.png inflating: coco/images/87.png inflating: coco/images/88.png inflating: coco/images/89.png inflating: coco/images/9.png inflating: coco/images/90.png inflating: coco/images/91.png inflating: coco/images/92.png inflating: coco/images/93.png inflating: coco/images/94.png inflating: coco/images/95.png inflating: coco/images/96.png inflating: coco/images/97.png inflating: coco/images/98.png 
inflating: coco/images/99.png inflating: coco/test.json inflating: coco/train.json inflating: coco/val.json
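Where wget and unzip are not available (for example, on a plain Windows shell), the download-and-extract step above can be approximated with the Python standard library. This is a minimal sketch, not part of Waffle: the helper name download_and_extract is hypothetical, and it skips the download when the archive already exists locally.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, zip_path, dest_dir):
    """Download a zip archive (unless it is already on disk) and extract it.

    Hypothetical helper for illustration; returns the archive's member names.
    """
    zip_file = Path(zip_path)
    if not zip_file.exists():
        # Fetch the archive only when no local copy is present.
        urllib.request.urlretrieve(url, str(zip_file))
    with zipfile.ZipFile(zip_file) as archive:
        # Extract everything under dest_dir, creating it if needed.
        archive.extractall(dest_dir)
        return archive.namelist()
```

For this tutorial, the call would be download_and_extract("https://github.com/snuailab/assets/raw/main/waffle/sample_dataset/mnist.zip", "mnist.zip", "coco").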
In [ ]:
from waffle_hub.dataset import Dataset

Dataset.from_coco(
    name="mnist_coco",
    task="object_detection",
    coco_file="coco/coco.json",
    coco_root_dir="coco/images",
)
loading annotations into memory... Done (t=0.00s) creating index... index created!
1it [00:00, 52.54it/s]: 0%| | 0/100 [00:00<?, ?it/s] Importing coco dataset: 100%|██████████| 100/100 [00:00<00:00, 5002.57it/s]
DatasetInfo(name='mnist_coco', task='OBJECT_DETECTION', created='2023-06-26 13:25:26')
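To sanity-check a COCO annotation file before or after importing it, the counts can be read straight from the JSON. The sketch below uses only the standard COCO top-level keys (images, annotations, categories); the function names summarize_coco and summarize_coco_file are hypothetical helpers, not Waffle APIs, and the exact contents of the sample file are not assumed.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Count images, annotations, and annotations per category name."""
    # Map category ids to their names so the counts are readable.
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    per_category = Counter(
        id_to_name.get(a["category_id"], "unknown")
        for a in coco.get("annotations", [])
    )
    return {
        "num_images": len(coco.get("images", [])),
        "num_annotations": len(coco.get("annotations", [])),
        "annotations_per_category": dict(per_category),
    }


def summarize_coco_file(path):
    """Load a COCO JSON file from disk and summarize it."""
    with open(path, encoding="utf-8") as f:
        return summarize_coco(json.load(f))
```

For the sample dataset, summarize_coco_file("coco/coco.json") should report an image count that matches the number of items shown in the import progress bar.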
Transformers (Hugging Face) Format
In [5]:
!wget https://github.com/snuailab/assets/raw/main/waffle/sample_dataset/mnist_huggingface_detection.zip
!unzip mnist_huggingface_detection.zip -d huggingface
--2023-06-26 13:29:44-- https://github.com/snuailab/assets/raw/main/waffle/sample_dataset/mnist_huggingface_detection.zip Resolving github.com (github.com)... 20.200.245.247 Connecting to github.com (github.com)|20.200.245.247|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://raw.githubusercontent.com/snuailab/assets/main/waffle/sample_dataset/mnist_huggingface_detection.zip [following] --2023-06-26 13:29:44-- https://raw.githubusercontent.com/snuailab/assets/main/waffle/sample_dataset/mnist_huggingface_detection.zip Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 226268 (221K) [application/zip] Saving to: ‘mnist_huggingface_detection.zip’ mnist_huggingface_d 100%[===================>] 220.96K --.-KB/s in 0.03s 2023-06-26 13:29:45 (7.64 MB/s) - ‘mnist_huggingface_detection.zip’ saved [226268/226268] Archive: mnist_huggingface_detection.zip inflating: huggingface/dataset_dict.json creating: huggingface/test/ inflating: huggingface/test/data-00000-of-00001.arrow inflating: huggingface/test/dataset_info.json inflating: huggingface/test/state.json creating: huggingface/train/ inflating: huggingface/train/data-00000-of-00001.arrow inflating: huggingface/train/dataset_info.json inflating: huggingface/train/state.json creating: huggingface/val/ inflating: huggingface/val/data-00000-of-00001.arrow inflating: huggingface/val/dataset_info.json inflating: huggingface/val/state.json
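The extracted archive above contains one folder per split (train, val, test), each holding a dataset_info.json, a state.json, and an .arrow data file. Before importing your own dataset, it may help to confirm that it follows the same layout. The following is a minimal sketch under that assumption; find_saved_splits is a hypothetical helper and does not call Waffle or the Hugging Face datasets library.

```python
from pathlib import Path


def find_saved_splits(dataset_dir):
    """Return the names of sub-folders that look like saved dataset splits.

    A folder counts as a split if it has dataset_info.json, state.json,
    and at least one .arrow file, mirroring the sample archive's layout.
    """
    root = Path(dataset_dir)
    splits = []
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        has_meta = (child / "dataset_info.json").exists() and (child / "state.json").exists()
        has_data = any(child.glob("*.arrow"))
        if has_meta and has_data:
            splits.append(child.name)
    return splits
```

For the sample dataset, find_saved_splits("huggingface") would be expected to list the test, train, and val folders.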
In [8]:
from waffle_hub.dataset import Dataset

Dataset.from_transformers(
    name="mnist_transformers",
    task="object_detection",
    dataset_dir="huggingface",
)
Out[8]:
DatasetInfo(name='mnist_transformers', task='OBJECT_DETECTION', created='2023-06-26 13:30:30')