Determining accuracy of segmentation versus ground truth disease patches of Blackleg on Turnip

My research involves the classification of a leaf spots on turnip derived from the pathogen blackleg. I had hypothesized that spatial patterns of pixels based on image classification are related to manually classified spatial patterns of observed disease on turnip leaves because disease has a characteristic spectral signature of infection on the leaves. Here I focus on the accuracy of previously determined disease patches through segmentation and ground truth classification. To do this a confusion matrix is used and allows for detection of true positives, true negatives, false positives and false negatives. All of the image processing took place in ArcGIS Pro. The next step involved the accuracy assessment which was conducted in R.

How accurate is the computer image classification?
• Can the accuracy be quantified?
• How can the image classification error be visually represented?

Tools and Methodology
I began in ArcGIS Pro to help visually represent the false negative and false positive regions of the disease patches through the segmentation process of classification. To start I turned on both layers where you can see the overlap between the two classification methods in Image 1.

I then went to the symbology of the segmented image and changed the color to white. I placed this layer on top, so it covered all the raster cells of the manually classified patches leaving only the false negatives, seen in Image 2. Next, I went back into the symbology, but for the manually classified image and changed the color scheme to white. I moved this layer to the top and changed the segmented image color scheme back to unique values. I was left with Image 3 showing the false positives. This was an easy way to visualize these disease patches and how well the classification method was working.

Next, I exported a Tiff file of both the manually classified patches and the segmented patches. To ensure each cell between the two layers lined up in R I had to make sure the extent was the same for both layers when I exported. To do this I right clicked on the layer and went down to data and selected export raster. A pane appears in right side of ArcGIS Pro where I hit the drop-down arrow for clipping geometry and selected current display. I did this with both layers and had one Tiff file for computer classified image through segmentation and one for the manually classified disease patches.

Using the raster, rgdal and sp packages I was able to upload my two Tiff files in raster format to R. I gave the two files each a name and used the plot function to view the two images. I noticed they both had values associated with each patch which were on a gradient scale. To correct for this, I converted my two raster layers in R to tables. This provided a coordinate for each cell and the value associated with it. In the image segmentation raster table I had 0 to 2 and the manually classified image I had 1 to 6. For all the white space I was given ‘N/A’, which was another issue. I used the xlsx package to export my data tables to excel files. I opened the two files in excel and used the sort smallest to largest function. From here I was able to use the replace function and change all the ‘N/A’ values to 0’s and all the values associated with pixels to 1’s. The values were arbitrary associated with the pixels and I needed the raster in 1’s and 0’s format. After doing this with both excel sheets I copy and pasted the two side by side and deleted all the associated coordinates. These were also unnecessary because each pixel from the same coordinate between the layers were in the same row. I saved this excel file and uploaded it into R. I downloaded the caret package and performed a confusion matrix which can be seen below in Table 1.

The visual representation for my false positive and false negative results can be seen below in Image 2 & 3 with Image 1 for comparison. You can see the false negatives for disease covers a much larger area than does the false positives. This may imply that the segmentation is limited in its assessment of diseased area. What it tends to miss is the margins of the disease but does a fair job of predicting the center of the disease where it likely originated and is most severe. To correct this, setting a larger threshold may allow for less severe regions of the disease to be classified. Because the segmentation is based on pixel reflectance value at the red band, this would mean the threshold value needs to be slightly lowered.

Additionally, an entire patch of disease was missed which can be seen in the right corner of Image 1. Currently, the classification system is set to only create segmented patches of 10 or more pixels. This patch is 9 pixels and therefore just missed the cutoff. Even though it was just shy of this requirement, we are unsure if the segmentation would have detected a difference in this diseased patch or if it was also out of the threshold for classification. If it is common for disease patches to be this small, it may be an indicator to lower the value to 5 or 6 for what it is allowed in the segmentation blocking.

There are multiple steps in the processing of the image where different routes could have been taken and potentially increased classification accuracy. The objective of this classification method is to have as little percent error as possible or at least determine if you’d rather have more false positive than false negatives, vice versa or equal. Here we have greater percent of cells which are false negatives and modest assessment of diseased pixels when simply visualizing the images.

To help quantify these images and determine how accurate the model was, a confusion matrix was used in R, seen in Table 1. The segmented classification correctly identified non-diseased regions very well and did a pretty good overall job of predicting disease that was confirmed with ground truth. There were 2075 true positives, 55 false negatives, 41 true negatives, and 12 false positives. The model correctly identified 97.1% of the 2183 total cells. The precision of the model was 42.7%. The sensitivity of the model was 77.4% and the specificity was 97.4%. The accuracy was very good for this model and is essentially a percent error calculation of the model. The sensitivity measures the proportion of actual positives that are correctly identified while the specificity is the opposite and measures the proportion of true negatives identified. The precision gives us a sense of how useful or complete the results are. The model did well overall but provided insights to possible adjustments that could be made which would increase the predictive power here.

Critique of method
One critique I have is the small sample size. While I simply intended to only lay down the framework for creating a stepwise process in disease classification, it supplies results that can hardly be statistically backed. I would like to increase my sample size to five images for part three and look for similarities and differences between the five. I also intend to make some adjustments to the process to try and increase overall accuracy, precision, specificity and sensitivity. So essentially this critique is the limited conclusion that can be drawn from a sample size of one and the need to increase that to five for now.

The second critique I have is the number of steps I have used to get to this point. I would like to find a more manageable way to do the segmentation process and the image processing steps to get to this point. I have found small changes I can make along the way. Ideally, I can use the Modelbuilder in ArcGIS Pro where most of the processing is done. This will streamline the process when I find a way to do this.

An error which is present in my results is the confusion matrix. The matrix is considering 2183 raster cells in order to perform the matrix. These cells were determined by a defined rectangle when exporting a Tiff file from ArcGIS Pro. Many of these cells are not even of the leaf and is simply classifying regions outside of the leaf. To correct this, I would need to export a Tiff file which is symbolic of the leaf shape. The confusion matrix results provided were therefore erroneous in a sense or a bit misleading.

Partner ideas
My partner talked about how they were doing a neighborhood analysis and it could be practical for me to do. She had mentioned doing it in earth engine which I haven’t used but could get some help from her. There is a multiple rings buffer and I could look at the false positives in this light. She also mentioned using geographically weighted regression. We didn’t discuss much about it, but it seemed like a good regression to perform on my error analysis. We related on some level with our projects data and issues but at the time didn’t have any clear resolves. I will be curious to follow up on our chat and see what type of analysis was performed and share my results as well.


Image 1. Overlap between segmentation on top and manual classification below

Image 2. False negatives after subtracting segmented regions from manually classified.

Image 3. False positives after manually classified cells are subtracted from segmented.

Table 1. Confusion matrix

R Code
##raster upload



img20_seg <- raster(“E:/Exercise1_Geog566/MyProject3/RasterT_afr7_Polygon_1.tif”)

img20_ground <- raster(“E:/Exercise1_Geog566/MyProject3/diseased_20_PolygonT_1.tif”)



##export data

raster.table <-, xy=T)
truth <-, xy=T)


tableimg <- raster.table

write.table(tableimg, file = “dataexport.csv”, sep = “,”)
write.table(truth, file = “truth1.csv”, sep = “,”)

##confusion matrix

#######this isn’t working


table(confusionM$reference, confusionM$predicted)

confusionMatrix(confusionM$reference, confusionM$predicted)

##For the confusion matrix if above doesn’t work

myconfusionM <- table(confusionM$predicted, confusionM$reference)

##accuracy of matrix

2075+41+12+55 #total
(12+55)/2183 #misclassified/total







Print Friendly, PDF & Email

1 thought on “Determining accuracy of segmentation versus ground truth disease patches of Blackleg on Turnip

  1. jonesju

    Taylor, this analysis came out very well. The maps of false positives and false negatives clearly demonstrate how the classification algorithm is functioning, and enable you to make adjustments to improve the classifier. Well done!

Comments are closed.