ImageKompressor

Overview

ImageKompressor uses the K-Means clustering algorithm to compress an image by identifying certain key RGB values and using those exclusively, rather than allowing any arbitrary RGB value. The K-Means clustering algorithm does this using the following steps iteratively:

  • Assign Centroids: For each pixel in the image, the algorithm finds the centroid closest to its pixel value by Euclidean distance.
  • Update Centroids: New centroids are generated by taking the average of all pixels assigned to a given centroid and updating that centroid to the new average.

After iterating this a certain number of times, we achieve centroids which accurately capture clusters in the data. For image compression, this means that we get a certain colors (each centroid represents a color) which capture much of the color data for the picture. Each pixel, rather than getting an RGB value, simply contains a centroid index representing which color to use.

Example

Original image

Original image

Compressed images

20 centroids 50 centroids 100 centroids
25 iterations Compressed 20/25 Compressed 50/25 Compressed 100/25
50 iterations Compressed 20/50 Compressed 50/50 Compressed 100/50
100 iterations Compressed 20/100 Compressed 50/100 Compressed 100/100

Technical details

The image compression is done using a combination of Python (using NumPy) and OpenCL to use to GPU to achieve greater concurrency where appropriate. The assignment stage is done via an OpenCL kernel which finds the closest centroid for each pixel concurrently. Then, the update happens via basic NumPy operations, and so on.

Project link: https://github.com/robcarney/ImageKompressor

Nifty tech tag lists fromĀ Wouter Beeftink