Make photomosaics, GIFs, and murals from pictures in Python with ML/OpenCV

(github.com)

126 points | by muzakthings 1954 days ago

7 comments

dfbrown 1954 days ago
I'm not an expert in the topic, but my understanding is RGB is a poor color space for computing color difference. This could be why your mosaics end up so washed out. [1] suggests using a CIELAB color space [2].
Edit: Looking at the code more closely it looks like you were using Lab at one point but commented it out[3], so I'm guessing you're already aware of this.
1: https://stackoverflow.com/a/9019461/185171
2: https://en.wikipedia.org/wiki/CIELAB_color_space#CIELAB
3: https://github.com/worldveil/photomosaic/blob/bb720efda11383...
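For anyone who wants to compare the two distances side by side, here is a minimal stdlib-only sketch. It assumes 8-bit sRGB input, a D65 white point, and the simple CIE76 difference (plain Euclidean distance in Lab). The function names are made up for illustration and are not taken from the project.

```python
import math

# D65 reference white (CIE 1931 2-degree observer); an assumption of this sketch.
WHITE = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (hypothetical helper)."""

    def linearize(c):
        # Undo the sRGB transfer curve to get linear light in [0, 1].
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)

    # Linear RGB -> XYZ with the standard sRGB (D65) matrix.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Cube root with the linear segment near zero used by CIELAB.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(rgb1, rgb2):
    """CIE76 color difference: Euclidean distance in Lab."""
    return math.dist(srgb_to_lab(rgb1), srgb_to_lab(rgb2))


def rgb_distance(rgb1, rgb2):
    """Naive Euclidean distance directly on the RGB values."""
    return math.dist(rgb1, rgb2)
```

The Lab conversion only has to run once per tile if the results are cached, so the per-comparison cost is the same single `math.dist` call either way.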
- muzakthings 1954 days ago
  It didn’t make a ton of difference empirically when I tried it.
  But you’re correct, generally that’s the space you want to be in.
  - gedy 1954 days ago
    It is quite noticeable when you are using a limited selection of tiles or the image has desaturated colors.
    I've done some similar work[1], but the issue with L*a*b* color is that it's terrifically slow to calculate the difference, at least in JS.
    [1] https://imgur.com/a/g3EzcSV
    - muzakthings 1954 days ago
      Totally. Everything improves with more images since that’s your palette for painting, so to speak. The ones I posted were made with fewer than 100 images, so you can definitely do better.
      L*a*b* was very slow, yes. It’s all done offline, but I tend to like quicker feedback...
  - itronitron 1954 days ago
    there is no perfect way to measure color distances, mostly because it needs to account for human perception of color and there are individual differences in color perception among people.
fireattack 1954 days ago
Is the example image (https://github.com/worldveil/photomosaic/blob/master/media/r...) with or without opacity cheat?
- muzakthings 1954 days ago
  Both! You can experiment with --best-k and --randomness <1.0 and sort of get things in the middle.
  Basically, this assigns a tile at random some fraction of the time (less than 100%), and for each tile that isn’t assigned randomly, it chooses among the top K matches by L2 distance with equal probability. Gives it a little bit of both.
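A rough sketch of that selection rule, as I read the description above. The function and its parameters are hypothetical (named after the flags for readability), and it assumes each tile is summarized by a single average color; the project's real code may differ.

```python
import math
import random


def pick_tile(target_color, tile_colors, best_k=5, randomness=0.1, rng=random):
    """Return the index of the tile to place in one mosaic cell (illustrative only)."""
    # With probability `randomness`, ignore color entirely and pick any tile.
    if rng.random() < randomness:
        return rng.randrange(len(tile_colors))
    # Otherwise rank the tiles by L2 distance to the target color...
    ranked = sorted(
        range(len(tile_colors)),
        key=lambda i: math.dist(target_color, tile_colors[i]),
    )
    # ...and pick uniformly among the best K matches.
    return rng.choice(ranked[:best_k])
```

With `best_k=1` and `randomness=0.0` this reduces to plain nearest-color matching; raising either value trades color fidelity for variety in the tiles.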
  - fireattack 1954 days ago
    I'm talking about `--opacity` - because to me the mosaic images used in the sky can't be that blue originally.
    - muzakthings 1954 days ago
      Right. The opacity setting is superimposed after the tile assignment. Thus you can have both.
      I’ve found that an opacity of 0.7 is often a nice compromise.
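To make "superimposed after the tile assignment" concrete, here is a minimal alpha-blend sketch on nested lists of RGB tuples. It assumes `opacity` is the weight given to the target image laid over the finished mosaic; that direction is my assumption, and the actual `--opacity` flag may define it the other way around. Both function names are hypothetical.

```python
def blend_pixel(tile_px, target_px, opacity=0.7):
    """Blend one RGB pixel; `opacity` weights the target image (assumed meaning)."""
    return tuple(
        round(opacity * t + (1 - opacity) * m)
        for m, t in zip(tile_px, target_px)
    )


def overlay(mosaic, target, opacity=0.7):
    """Blend a finished mosaic with the target image, pixel by pixel.

    Both images are lists of rows of (r, g, b) tuples with the same size.
    """
    return [
        [blend_pixel(m, t, opacity) for m, t in zip(mosaic_row, target_row)]
        for mosaic_row, target_row in zip(mosaic, target)
    ]
```

Because the blend runs after the tiles are chosen, it is independent of how they were matched, which is why a tile that started out as a blue horizon can end up tinted red.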
- PavlovsCat 1954 days ago
  Yeah that's using opacity, e.g. there's an image of a blue horizon that gets tinted red.
rmonroe 1954 days ago
Way better implementation of the face alignment than what I did for our Peru trip. Good going ;-)
androidgirl 1954 days ago
The gif with facial recognition is actually really really cool. Awesome work
aaaaaaaaaab 1954 days ago
Ok, but where is the ML part?
We’ve been creating these mosaics for decades...
- muzakthings 1954 days ago
  The face montage building trains a linear classifier on top of the pretrained embedding network - it’s the portion that talks about creating a training folder of your face.
  But yes, strictly speaking the photomosaics don’t use ML, unless you count the internal fun stuff Faiss (the similarity search lib) does to construct fast indexes.
- simple10 1954 days ago
  It's using the KMeans[1] library from sklearn.cluster. But this isn't really ML, is it? My ML knowledge is limited. Regardless, it's a cool project. OP might want to update the title to remove ML.
  [1] https://github.com/worldveil/photomosaic/blob/master/emosaic...
  - simple10 1954 days ago
    "[Kmeans] algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means due to the name." Maybe an ML expert could elaborate? I've been curious on my own projects when to actually mention when they use true ML or not.
    https://en.wikipedia.org/wiki/K-means_clustering
    - muzakthings 1954 days ago
      Right. See my comment above.
      As to why it was in the project: If you treat each pixel as an example vector in 3 dimensions and cluster, you get the “dominant” colors for the image. It’s a primitive way to compress images as well. In this case I was just using it to generate fun cards that would use a minimal number of dominant colors. It’s still in the code if you’d like to use it, but it's a bit hidden.
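A bare-bones illustration of the pixels-as-3D-vectors idea, using a hand-rolled k-means in place of `sklearn.cluster.KMeans` so it runs with the standard library alone. The function name, the fixed iteration count, and the seeded random initialization are all assumptions of this sketch, not the project's code.

```python
import math
import random


def dominant_colors(pixels, k=3, iterations=10, seed=0):
    """Cluster RGB pixels with plain k-means and return the k centroids."""
    rng = random.Random(seed)
    # Start from k pixels chosen at random (pixels must be a list).
    centroids = rng.sample(pixels, k)
    for _ in range(iterations):
        # Assignment step: put each pixel in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster ended up empty.
        centroids = [
            tuple(sum(channel) / len(c) for channel in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids
```

Replacing every pixel with its nearest centroid gives the primitive compression mentioned above: the image is reduced to k colors.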
    - _fullpint 1954 days ago
      K-nearest neighbor clusters based upon the k most similar objects.
      K-means clusters on centroids that are means. After every iteration, new means are calculated and then reclustering occurs.
  - muzakthings 1954 days ago
    Naw I had used the Kmeans part for extracting dominant colors. I was thinking about using it to generate cards and using the top K colors for the background of the card.
    As I mentioned above the only ML is the face classifier. There’s a flag that allows you to only include face pics in the photomosaics as well
itronitron 1954 days ago
like others here, i really like the aligned face montage gif and it seems like it would be a great product for people to package up their selfies over a timeline
giladoved 1954 days ago
This is fantastic, great project!