# Endoscopy Computer Vision Challenges on Segmentation and Detection [EndoCV2020](https://endocv.grand-challenge.org/)

Internal users: see the other version of this README inside [ead2020 note](https://gitlab.lrde.epita.fr/nboutry/ead2020/-/tree/master/notes)

Following the great success of the Endoscopy Artefact Detection challenge (EAD2019), this year EndoCV2020 is introduced with two sub-challenges:

* Sub-challenge I:  Endoscopy Artefact Detection and Segmentation (EAD2020)
* Sub-challenge II: Endoscopy Disease Detection and Segmentation (EDD2020)

Each sub-challenge is further divided into semantic segmentation and detection tasks.

We participated in both sub-challenges, focusing mainly on the semantic segmentation tasks.
Our solution is summarized in
["A U-Net++ With Pre-Trained EfficientNet Backbone for Segmentation of Diseases and Artifacts in Endoscopy Images and Videos"](http://ceur-ws.org/Vol-2595/endoCV2020_paper_id_11.pdf),
published in the proceedings of the EndoCV2020 challenge (an ISBI challenge).
 
Abstract:

>Endoscopy is a widely used clinical procedure for the early detection of numerous diseases. However, the images produced are usually heavily corrupted with multiple artifacts that reduce the visualization of the underlying tissue. Moreover, the localization of actual diseased regions is also a complex problem. For that reason, EndoCV2020 challenges aim to make progress in the state-of-the-art in the detection and segmentation of artifacts and diseases in endoscopy images. In this work, we propose approaches based on U-Net and U-Net++ architecture to automate the segmentation task of EndoCV2020. We use the EfficientNet as our encoder to extract powerful features for our decoders. Data augmentation and pre-trained weights are employed to prevent overfitting and improve generalization. Test-time augmentation also helps in improving the results of our models. Our methods perform well in this challenge and achieve a score of 60.20% for the EAD2020 semantic segmentation task and 59.81% for the EDD2020's.
 
## Content
This repository contains a solution to the [EndoCV2020](https://endocv.grand-challenge.org/) challenges. It allows the reproduction of the results that we submitted.
It contains extracts of two repositories (not visible outside LRDE yet!): 
* https://gitlab.lrde.epita.fr/dhuynh/segmentation-detection : a Keras library that supports a wide range of models based on the encoder-decoder architecture.
* https://gitlab.lrde.epita.fr/nboutry/ead2020 : our internal collaborative repository for this challenge.

## Dependencies
We used TensorFlow 1.14 as our backend for this work; the solution is not fully tested with TensorFlow 2.x or other versions. TensorFlow 1.14 is not listed as a dependency, so please install the CPU or GPU package depending on your system (you may need Python 3.7):
 
```
pip install tensorflow==1.14
```
or
```
pip install tensorflow-gpu==1.14
```

Then install the dependencies listed in requirements.txt

```
pip install -r requirements.txt
```
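As a quick sanity check before running any script, you can verify that the installed TensorFlow is on the 1.x line. The helper below is a hypothetical snippet for illustration, not part of this repository:

```python
def is_supported_tf(version: str) -> bool:
    """Return True only for the TensorFlow 1.x line this code targets."""
    return version.split(".")[0] == "1"

# Example usage against the installed package:
# import tensorflow as tf
# assert is_supported_tf(tf.__version__), tf.__version__
```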


## Run on EndoCV test data
Assuming that the trained weights and the test data are available to you  ([internal users: check this note](https://gitlab.lrde.epita.fr/nboutry/ead2020/-/tree/master/notes)), these scripts produce the submissions to EndoCV challenges.

All the inference scripts take the same arguments:
```
usage: script_name.py [-h] [--datapath DATAPATH]
                      [--outputpath OUTPUTPATH]
                      weight

positional arguments:
  weight

optional arguments:
  -h, --help               show this help message and exit
  --datapath DATAPATH      The test data path
  --outputpath OUTPUTPATH  The output path

```
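For reference, the shared command-line interface above could be reproduced with an `argparse` sketch like the following. This is an illustration, not the actual code from the scripts, and the default paths are assumptions:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the usage text shown above: one positional weight file
    # plus optional data and output paths.
    parser = argparse.ArgumentParser()
    parser.add_argument("weight", help="path to the trained weight file")
    parser.add_argument("--datapath", default="../testsets",   # assumed default
                        help="The test data path")
    parser.add_argument("--outputpath", default="../RESULTS",  # assumed default
                        help="The output path")
    return parser

args = build_parser().parse_args(["model.h5", "--datapath", "testsets"])
print(args.weight, args.datapath, args.outputpath)
```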

To run with the default settings, put all the test data inside a folder named **testsets**. 
The directory should look like this:

```
testsets/
├── EAD2020-Phase-II-Evaluation
│   ├── Detection
│   ├── Detection_sequence
│   ├── Generalization
│   └── SemanticSegmentation
└── EDD_evaluation_test-Final
```

From the root folder, run:
```
cd EAD2020-Phase-II-Evaluation
python ead_test_detection.py [WEIGHT_PATH]
python ead_test_segmentation.py [WEIGHT_PATH]
cd ../EDD_evaluation_test-Final
python edd_seg_dec_unetpp.py [WEIGHT_PATH]
python edd_seg_dec_unetpp_big.py [WEIGHT_PATH]
```
The results will be in a folder named **RESULTS**:
```
RESULTS/
├── EAD2020-Phase-II-Evaluation
│   ├── ead_yolov3
│   │   ├── detection_bbox
│   │   ├── generalization_bbox
│   │   └── sequence_bbox
│   ├── edd_unetpp
│   │   └── semantic_masks
│   └── edd_unetpp_filled_filtered
│       └── semantic_masks
└── EDD_evaluation_test-Final
    ├── edd_unetpp
    │   ├── detection_bbox
    │   └── semantic_masks
    └── edd_unetpp_big
        ├── detection_bbox
        └── semantic_masks
```

**ead_yolov3** and **edd_unetpp_filled_filtered** formed our best submission to EAD2020 
(**1st** in segmentation; see the [ead2020 results, sorted by semantic segmentation score (sscore)](https://ead2020.grand-challenge.org/evaluation/results/)).

**edd_unetpp_big** is our best submission to EDD2020 (6th in segmentation; see the [edd2020 results](https://edd2020.grand-challenge.org/evaluation/results/)).

By default, these scripts do not use the GPU. To enable GPU usage, modify this line in each script file (replace "-1" with the index of the GPU to use, e.g. "0"):
```
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
```
## Run on test data with Docker
There is a Dockerfile inside this repository. Build the Docker image (note that image names must be lowercase):
```
docker build -t a-nice-name .
```
Prepare a folder containing all the test data, with the same structure as in the earlier section, and put all the trained weights inside a single folder.

Then run a command like the following to get all our networks' outputs inside a results folder:
```
docker run --rm --user="$(id -u):$(id -g)" \
--userns=host \
--volume="</path/to/test/data>:/usr/src/app/testsets" \
--volume="</path/to/weights>:/usr/src/app/weights" \
--volume="</path/to/where/you/want/the/results/to/be>:/usr/src/app/RESULTS" \
a-nice-name /bin/sh run_test_sets.sh
```
  
## Training
### Getting the data
Please contact the EndoCV organizers to acquire all parts of the two training sets.

* The EAD2020 train set was released in 4 files:
    * ead2020_semantic_segmentation.zip
    * EAD2020_dataType_framesOnly.zip
    * EAD2020-Phase-II_dataType_sequence.zip
    * EAD2020-Phase-II-Detection_Segmentation-99Frames.zip


* The EDD2020 train set was released in 1 file:
    * EDD2020_release-I_2020-01-15_v2_s3.zip

Unzip these files inside a folder named **endocv_data** at the root of this repo. That directory should look like this:
```
endocv_data/
├── EAD2020_dataType_framesOnly
│   ├── frames
│   └── gt_bbox
├── EAD2020-Phase-II_dataType_sequence
│   ├── Detection
│   │   ├── 1445-01
│   │   ├── 1445-01_bbox
│   │   ├── 1445-02
│   │   ├── 1445-02_bbox
│   │   ├── 1445-03
│   │   ├── 1445-03_bbox
│   │   ├── PYL-02
│   │   ├── PYL-02-bbox
│   │   ├── PYL-N-02
│   │   └── PYL-N-02_bbox
│   └── SemanticSegmentation
│       ├── 1445-01
│       ├── 1445-01_masks
│       ├── 1445-02
│       └── 1445-02_masks
├── EAD2020-Phase-II-Detection_Segmentation-99Frames
│   ├── bbox
│   ├── originalImages
│   └── semanticMasks
├── ead2020_semantic_segmentation
│   ├── images_ead2020
│   └── masks_ead2020
├── EDD2020_release-I_2020-01-15
│   ├── bbox
│   ├── masks
│   ├── masksPerClass
│   │   ├── BE
│   │   ├── cancer
│   │   ├── HGD
│   │   ├── polyp
│   │   └── suspicious
│   └── originalImages
```
  
### Pre-process the data
Run the following scripts to pre-process the training sets.
In short, these scripts organize the training data into several pandas DataFrames and mark or modify some annotations.

* Modifications to the bounding-box ground truth:
    * Remove boxes with zero area.
    * Clip boxes that extend outside the image domain back inside.
* Marking of the segmentation ground truth:
    * Record the list of class IDs whose masks have a one-pixel-high line at the bottom; this marking is handled by the data loader.

```
python processing_codes/processing_EAD_dec.py
python processing_codes/processing_EAD_seg.py
python processing_codes/processing_EDD.py
``` 
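The bounding-box cleanup described above can be sketched as follows. This is an illustrative snippet with hypothetical names, not the exact code in **processing_codes**:

```python
def clean_bboxes(boxes, img_w, img_h):
    """Drop zero-area boxes and clip the rest to the image domain.

    Boxes are (x_min, y_min, x_max, y_max) tuples in pixel coordinates.
    """
    cleaned = []
    for x0, y0, x1, y1 in boxes:
        # Move coordinates that fall outside the image back inside.
        x0, x1 = max(0, x0), min(img_w, x1)
        y0, y1 = max(0, y0), min(img_h, y1)
        # Keep only boxes with a strictly positive area after clipping.
        if (x1 - x0) > 0 and (y1 - y0) > 0:
            cleaned.append((x0, y0, x1, y1))
    return cleaned
```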
### Get YOLOv3 pre-trained weights
If you want to retrain the artifact detectors, get the YOLOv3 pre-trained weights from the original authors and convert them for this implementation:
```
mkdir weights
cd processing_codes
bash get_yolo_weights.sh
```

### Train the models
Training scripts for models presented in "A U-Net++ With Pre-Trained EfficientNet Backbone
 for Segmentation of Diseases and Artifacts in Endoscopy Images and Videos" are in **train_codes**.
 
 
| Files             | Model       | Dataset |
|-------------------|-------------|---------|
| ead_unetpp.py     | Unet++B1    | EAD2020 |
| ead_yolov3.py     | YOLOv3      | EAD2020 |
| edd_unetpp.py     | Unet++B1    | EDD2020 |
| edd_unetpp_big.py | Unet++512B1 | EDD2020 |

Run each of these scripts with:

```
python train_codes/script_name.py
```


## Citation
If you use this code for your research, please cite our [EndoCV2020 challenge paper](http://ceur-ws.org/Vol-2595/endoCV2020_paper_id_11.pdf):


```
@inproceedings{DBLP:conf/isbi/HuynhB20,
  author    = {L{\^{e}} Duy Huynh and
               Nicolas Boutry},
  editor    = {Sharib Ali and
               Christian Daul and
               Jens Rittscher and
               Danail Stoyanov and
               Enrico Grisan},
  title     = {A U-Net++ With Pre-Trained EfficientNet Backbone for Segmentation
               of Diseases and Artifacts in Endoscopy Images and Videos},
  booktitle = {Proceedings of the 2nd International Workshop and Challenge on Computer
               Vision in Endoscopy, EndoCV@ISBI 2020, Iowa City, Iowa, USA, 3rd April
               2020},
  series    = {{CEUR} Workshop Proceedings},
  volume    = {2595},
  pages     = {13--17},
  publisher = {CEUR-WS.org},
  year      = {2020},
  url       = {http://ceur-ws.org/Vol-2595/endoCV2020\_paper\_id\_11.pdf}
}

```