I'm trying to use multiprocessing.Pool to compare images based on similarity. I have code that works on a single core, using a for loop or map() to iterate over the data, but it's dreadfully slow on large groups of images. For that reason I've been trying to implement multiprocessing, but I can't seem to get it right. My main question is: why doesn't getssim() in the code below change the list?
The structure of the iterable looks something like this: [[(filename, imagedata), float], ...]
The float is the similarity index of an image compared to the current image being tested. Here is the (somewhat abbreviated) non-working code:
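    import glob
    import multiprocessing
    import operator
    import cv2
    from skimage.measure import structural_similarity as ssim

    def makeSimilarList(imagesdata):
        simImgList = []  # list of images ordered by their similarity
        while imagesdata:
            simImg = findSimilar(imagesdata)
            simImgList.append(simImg)
        return simImgList

    def getssim(imgd):
        # imgd has the form [(filename, imagedata), similarity]
        similarityIndex = ssim(img1[0][1], imgd[0][1])
        print(similarityIndex)  # this prints correctly
        imgd[1] = similarityIndex
        return imgd  # this appears to have no effect

    def findSimilar(imagesdata):
        limg = imagesdata.pop()
        global img1  # making img1 accessible to getssim, a bad idea!
        img1 = limg
        p = multiprocessing.Pool(processes=multiprocessing.cpu_count(), maxtasksperchild=2)
        p.map(getssim, imagesdata)
        p.close()
        p.join()
        imagesdata.sort(key=operator.itemgetter(1))
        return limg[0][0]  # return name of image

    images = [f for f in glob.glob(src + "*." + ftype)]
    imagesdata = [[(f, cv2.imread(f, 0)), ""] for f in images]
    finalList = makeSimilarList(imagesdata)
    with open("./simlist.txt", 'w') as f:
        ...  # write finalList to the file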
Thanks for the help!!
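You forgot to assign the result of p.map (that is, multiprocessing.Pool.map) to a variable. Each worker process receives a pickled copy of its argument, so the changes getssim makes never reach the list in the parent process; the updated items only come back as the return value of map. The key function should probably read: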
    import operator

    def findSimilar(imagesdata):
        limg = imagesdata.pop()
        global img1  # making img1 accessible to getssim, a bad idea!
        img1 = limg
        p = multiprocessing.Pool(maxtasksperchild=2)
        # Keep the list returned by map; the workers only modified copies.
        imagesdata = p.map(getssim, imagesdata)
        p.close()
        p.join()
        imagesdata.sort(key=operator.itemgetter(1))
        return limg[0][0]  # return name of image
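If you want to convince yourself that the return value is what matters, here is a minimal sketch that needs no images. The function set_score and the sample data are hypothetical stand-ins for getssim and imagesdata, and the "score" is just a placeholder number, not a real similarity index.

```python
import multiprocessing


def set_score(item):
    """Hypothetical stand-in for getssim: fill the score slot and return the item."""
    item[1] = len(item[0][0])  # placeholder score, not a real similarity index
    return item


if __name__ == "__main__":
    # Same shape as imagesdata: [(filename, imagedata), score]
    data = [[("a.png", None), ""], [("bb.png", None), ""]]
    with multiprocessing.Pool(processes=2) as pool:
        result = pool.map(set_score, data)
    print(data)    # still has "" in the score slot: workers changed copies
    print(result)  # the updated items, returned by map
```

Called directly in a single process, set_score does modify its argument in place, which is why the for loop and the built-in map() version appeared to work.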
Since you don't give enough details, I could not test your code, but I think this was the crucial point.
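As for the global you flag as a bad idea: one possible way around it is to bind the reference image to the worker with functools.partial, so each call receives it as an ordinary argument. This is only a sketch under assumptions: score_against and rank_by_similarity are hypothetical names, the "images" are plain numbers, and the metric is a placeholder for ssim.

```python
import functools
import multiprocessing


def score_against(reference, item):
    """Hypothetical worker: score `item` against `reference`, return a new item."""
    (name, pixels), _ = item
    # Placeholder metric (negative absolute difference) standing in for ssim.
    score = -abs(reference - pixels)
    return [(name, pixels), score]


def rank_by_similarity(reference, items, processes=2):
    """Score all items in a pool and return them sorted, most similar first."""
    worker = functools.partial(score_against, reference)
    with multiprocessing.Pool(processes=processes) as pool:
        scored = pool.map(worker, items)
    return sorted(scored, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    items = [[("a", 10), ""], [("b", 3), ""], [("c", 6), ""]]
    print(rank_by_similarity(5, items))
```

The partial object is pickled and sent to the workers together with the tasks, so with large reference images this probably costs some extra transfer time compared to the global; whether that matters would need measuring on your data.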