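Compress an image to a 4 KiB preview

In this challenge you will write an image preview compression algorithm. The goal is to reduce an arbitrary image file to a 4 KiB preview image that can be used to quickly identify images with very little bandwidth.

You must write two programs (or one combined program): a compressor and a decompressor. Both must take a file or stdin as input, and output to a file or stdout. The compressor must accept one image in a mainstream lossless image format of your choice (e.g. PNG, BMP, PPM) and output a file of at most 4096 bytes. The decompressor must accept any file generated by the compressor and output an image as close as possible to the input. There is no source code size limit for the encoder/decoder, so you can be creative in your algorithm.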

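Restrictions:

No "cheating". Your programs may not use hidden inputs, store data on the internet, etc. You are also forbidden from including features/data pertaining only to the set of scoring images.
For libraries/tools/built-ins you may use generic image processing operations (scaling, blurring, color space transformation, etc.), but not image decoding/encoding/compression operations (except for compressor input and decompressor output). Generic compression/decompression is also disallowed. It is intended that you implement your own compression for this challenge.
The dimensions of the image output by the decompressor must exactly match those of the original file given to the compressor. You may assume that the image dimensions do not exceed 2^16 in either direction.
Your compressor must run on an average consumer PC in under 5 minutes, and the decompressor must run in under 10 seconds for any image in the set below.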
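
Because the decompressor has to reproduce the original dimensions exactly, the compressed file needs to carry them somewhere. As a minimal sketch (the function names `pack_size` and `unpack_size` are hypothetical, and it assumes every image is at least 1 pixel in each direction), storing each dimension minus one lets values up to 2^16 inclusive fit in two bytes each:

```python
import struct

MAX_DIM = 2 ** 16  # upper bound on width/height stated in the rules

def pack_size(width, height):
    """Pack (width, height) into 4 bytes, storing each value minus one."""
    if not (1 <= width <= MAX_DIM and 1 <= height <= MAX_DIM):
        raise ValueError("dimensions must be in 1..2**16")
    # ">HH" = two big-endian unsigned 16-bit integers
    return struct.pack(">HH", width - 1, height - 1)

def unpack_size(header):
    """Inverse of pack_size: read the first 4 bytes and return (width, height)."""
    w, h = struct.unpack(">HH", header[:4])
    return w + 1, h + 1
```

This leaves 4092 of the 4096 bytes for the actual image data.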

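Scoring

For quick verification and visual comparison, please include a lossless image album of the test corpus after compression using your answer.

Your compressor will be tested using the following corpus of images:

You can download all images in a zip file here.

Your score will be the average structural similarity index of your compressor on all the images. We'll be using the open-source dssim for this challenge. It is easily built from source, and if you're on Ubuntu it also has a PPA. It is preferred that you score your own answer, but if you do not know how to build C applications and you do not run Debian/Ubuntu, you can let someone else score for you. dssim expects input/output in PNG, so convert your output to PNG first if you output in a different format.

To make scoring painless, here's a quick helper Python script, usage: python score.py corpus_dir compressed_dir
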

import glob, sys, os, subprocess

scores = []
for img in sorted(os.listdir(sys.argv[1])):
    ref, preview = (os.path.join(sys.argv[i], img) for i in (1, 2))
    sys.stdout.write("Comparing {} to {}... ".format(ref, preview))
    out = subprocess.check_output(["dssim", ref, preview]).decode("utf-8").split()[0]
    print(out)
    scores.append(float(out))

print("Average score: {:.6f}".format(sum(scores) / len(scores)))

Lowest score wins.

code-challenge image-processing compression

— orlp

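Does the compressed image have to be viewable?

— Eumel

@Eumel You can consider the decompressor as a viewer. So no, your compressed format can be arbitrary and is entirely up to you. Only after decompression does a viewable image have to come out.

— orlp

"You may assume that the image dimensions do not exceed 2^32 in either direction." Isn't this a bit excessive? It means I have to use up to 16 bytes to store a pair of (x, y) coordinates. Few image files have dimensions of more than 2^16 (65536) pixels in either direction, and 2^11 is sufficient for all the images in the corpus.

— Peter Olson

@PeterOlson I'll change it to 2^16.

— orlp

@orlp The rules state that the decompressor must take less than ten seconds to decode an image. The idea I have may take a few minutes to generate reference files that are then used in subsequent calls to the decompressor, i.e. it is a one-time activity similar to "installing" an application. Would this solution be disqualified?

— Moogie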

Answers:

Python with PIL, score 0.094218

Compressor:

#!/usr/bin/env python
from __future__ import division
import sys, traceback, os
from PIL import Image
from fractions import Fraction
import time, io

def image_bytes(img, scale):
    w,h = [int(dim*scale) for dim in img.size]
    bio = io.BytesIO()
    img.resize((w,h), Image.LANCZOS).save(bio, format='PPM')
    return len(bio.getvalue())

def compress(img):
    w,h = img.size
    w1,w2 = w // 256, w % 256
    h1,h2 = h // 256, h % 256
    n = w*h
    total_size = 4*1024 - 8 #4 KiB minus 8 bytes for
                            # original and new sizes
    #beginning guess for the optimal scaling
    scale = Fraction(total_size, image_bytes(img, 1))
    #now we do a binary search for the optimal dimensions,
    # with the restriction that we maintain the scale factor
    low,high = Fraction(0),Fraction(1)
    best = None
    start_time = time.time()
    iter_count = 0
    while iter_count < 100: #scientifically chosen, not arbitrary at all
        #make sure we don't take longer than 5 minutes for the whole program
        #10 seconds is more than reasonable for the loading/saving
        if time.time() - start_time >= 5*60-10:
            break
        size = image_bytes(img, scale)
        if size > total_size:
            high = scale
        else:
            # size fits the budget; remember the closest fit so far
            low = scale
            if best is None or total_size-size < best[1]:
                best = (scale, total_size-size)
            if size == total_size:
                break  # exact fit, cannot do better
        scale = (low+high)/2
        iter_count += 1
    w_new, h_new = [int(dim*best[0]) for dim in (w,h)]
    wn1,wn2 = w_new // 256, w_new % 256
    hn1, hn2 = h_new // 256, h_new % 256
    i_new = img.resize((w_new, h_new), Image.LANCZOS)
    bio = io.BytesIO()
    i_new.save(bio, format='PPM')
    # bytes(bytearray(...)) builds the 8-byte header on both Python 2 and 3
    return bytes(bytearray((w1,w2,h1,h2,wn1,wn2,hn1,hn2))) + bio.getvalue()

if __name__ == '__main__':
    for f in sorted(os.listdir(sys.argv[1])):
        try:
            print("Compressing {}".format(f))
            with Image.open(os.path.join(sys.argv[1],f)) as img:
                with open(os.path.join(sys.argv[2], f), 'wb') as out:
                    out.write(compress(img.convert(mode='RGB')))
        except:
            print("Exception with {}".format(f))
            traceback.print_exc()
            continue

Decompressor:

#!/usr/bin/env python
from __future__ import division
import sys, traceback, os
from PIL import Image
from fractions import Fraction
import io

def process_rect(rect):
    return rect

def decompress(compressed):
    # bytearray yields integers on both Python 2 and 3
    w1,w2,h1,h2,wn1,wn2,hn1,hn2 = bytearray(compressed[:8])
    w,h = (w1*256+w2, h1*256+h2)
    wc, hc = (wn1*256+wn2, hn1*256+hn2)
    img_bytes = compressed[8:]
    bio = io.BytesIO(img_bytes)
    img = Image.open(bio)
    return img.resize((w,h), Image.LANCZOS)


if __name__ == '__main__':
    for f in sorted(os.listdir(sys.argv[1])):
        try:
            print("Decompressing {}".format(f))
            with open(os.path.join(sys.argv[1],f), 'rb') as img:
                decompress(img.read()).save(os.path.join(sys.argv[2],f))
        except:
            print("Exception with {}".format(f))
            traceback.print_exc()
            continue

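Both scripts take their input via command-line arguments as two directories (input and output), and convert all images in the input directory.

The idea is to find a size that fits under 4 KiB and has the same aspect ratio as the original, and to use a Lanczos filter to get as high a quality as possible out of the downsampled image.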
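
The size search can be illustrated without PIL. The following is only a sketch under stated assumptions: it computes the size of a binary PPM from its dimensions (assuming a header of the form `P6\nW H\n255\n` and 3 bytes per pixel) instead of actually resizing an image, and the names `ppm_size` and `best_dimensions` are hypothetical.

```python
from fractions import Fraction

BUDGET = 4 * 1024 - 8  # 4 KiB minus the 8-byte size header

def ppm_size(w, h):
    # Binary PPM: text header "P6\nW H\n255\n" followed by 3 bytes per pixel.
    header = "P6\n{} {}\n255\n".format(w, h)
    return len(header) + 3 * w * h

def best_dimensions(width, height, budget=BUDGET, iterations=60):
    """Bisect on the scale factor; return the largest (w, h) found that fits."""
    low, high = Fraction(0), Fraction(1)
    best = None
    for _ in range(iterations):
        scale = (low + high) / 2
        w, h = int(width * scale), int(height * scale)
        if w >= 1 and h >= 1 and ppm_size(w, h) > budget:
            high = scale          # too big: shrink
        else:
            low = scale           # fits (or is still empty): grow
            if w >= 1 and h >= 1:
                best = (w, h)
    return best
```

Since an uncompressed PPM spends 3 bytes on every pixel, the budget allows only around 1360 pixels in total, which is why the previews are so small before being scaled back up. Images that already fit at full size are not special-cased in this sketch.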

Imgur album of the compressed images, after resizing to the original dimensions

Scoring script output:

Comparing corpus/1 - starry.png to test/1 - starry.png... 0.159444
Comparing corpus/2 - source.png to test/2 - source.png... 0.103666
Comparing corpus/3 - room.png to test/3 - room.png... 0.065547
Comparing corpus/4 - rainbow.png to test/4 - rainbow.png... 0.001020
Comparing corpus/5 - margin.png to test/5 - margin.png... 0.282746
Comparing corpus/6 - llama.png to test/6 - llama.png... 0.057997
Comparing corpus/7 - kid.png to test/7 - kid.png... 0.061476
Comparing corpus/8 - julia.png to test/8 - julia.png... 0.021848
Average score: 0.094218

— Mego

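I just noticed that your solution uses WebP, which is not allowed. Your solution is invalid.

— orlp

@orlp It's fixed now to use PPM, which is an uncompressed format.

— Mego

Alright. This challenge does expose quite a weakness of DSSIM though. I would argue that most of Moogie's images look substantially better.

— orlp

@orlp They look good as thumbnails. When blown up to their original dimensions (using Lanczos), they look about the same quality or worse. I'm working on getting thumbnails of my output uploaded.

— Mego

Java (vanilla, should work with Java 1.5+), score 0.672

It doesn't generate particularly good dssim scores but, to my eye, it produces more human-friendly images...

Album: http://imgur.com/a/yL31U

Scoring script output: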

Comparing corpus/1 - starry.png to test/1 - starry.png... 2.3521
Comparing corpus/2 - source.png to test/2 - source.png... 1.738
Comparing corpus/3 - room.png to test/3 - room.png... 0.1829
Comparing corpus/4 - rainbow.png to test/4 - rainbow.png... 0.0633
Comparing corpus/5 - margin.png to test/5 - margin.png... 0.4224
Comparing corpus/6 - llama.png to test/6 - llama.png... 0.204
Comparing corpus/7 - kid.png to test/7 - kid.png... 0.36335
Comparing corpus/8 - julia.png to test/8 - julia.png... 0.05
Average score: 0.672

Compression chain:

1. if filter data has already been generated goto step 6
2. create sample images from random shapes and colours
3. take sample patches from these sample images
4. perform k-clustering of patches based on similarity of luminosity and chromanosity.
5. generate similarity ordered lists for each patch as compared to the other patches.
6. read in image
7. reduce image size to current multiplier * blocksize
8. iterate over image comparing each blocksize block against the list of clustered luminosity patches and chromanosity patches, selecting the closest match
9. output the index of the closest match from the similarity ordered list (of the previous block) (repeat 8 for chromanosity)
10. perform entropy encoding using deflate.
11. if output size is < 4096 bytes then increment current multiplier and repeat step 7
12. write previous output to disk.
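
Steps 5, 8 and 9 can be illustrated with a toy example. This is not the code from the gist; it is a minimal sketch in Python that assumes patches are plain tuples of numbers, uses the sum of squared differences as the similarity measure, and codes the first block relative to patch 0. All function names are hypothetical.

```python
def distance(a, b):
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def similarity_lists(codebook):
    """Step 5: for each patch, all patch indices ordered from most to least similar."""
    return [
        sorted(range(len(codebook)), key=lambda j: distance(codebook[i], codebook[j]))
        for i in range(len(codebook))
    ]

def encode(blocks, codebook):
    """Steps 8-9: emit each block's closest patch as a rank in the previous patch's list."""
    order = similarity_lists(codebook)
    prev = 0  # assumption: the first block is coded relative to patch 0
    ranks = []
    for block in blocks:
        best = min(range(len(codebook)), key=lambda j: distance(block, codebook[j]))
        ranks.append(order[prev].index(best))
        prev = best
    return ranks

def decode(ranks, codebook):
    """Invert encode: follow the ranks through the same similarity lists."""
    order = similarity_lists(codebook)
    prev = 0
    patches = []
    for rank in ranks:
        prev = order[prev][rank]
        patches.append(codebook[prev])
    return patches
```

The presumable benefit of coding ranks rather than raw indices is that neighbouring blocks tend to resemble each other, so most ranks are small and the stream compresses better in step 10.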

To decompress, inflate and then read the block indexes, output the corresponding patches to the output file, and then resize to the original dimensions.
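
The deflate/inflate round trip for the block indexes could look roughly like this in Python. It is a sketch only: it assumes every index fits in one byte and uses the standard `zlib` module, whereas the actual submission is Java and its byte layout is not described here. The helper names are hypothetical.

```python
import zlib

LIMIT = 4096  # maximum size of the compressed file in bytes

def pack_indexes(indexes):
    """Deflate a list of block indexes (each in 0..255) at maximum compression."""
    return zlib.compress(bytes(bytearray(indexes)), 9)

def unpack_indexes(payload):
    """Inflate the payload and return the block indexes as a list of ints."""
    return list(bytearray(zlib.decompress(payload)))

def fits(payload, limit=LIMIT):
    """Step 11 check: does the deflated output still fit in the budget?"""
    return len(payload) <= limit
```

A compressor following the chain above would keep increasing the multiplier while `fits` holds and write the last payload that passed.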

Unfortunately, the code is too large for stackoverflow, so it can be found at https://gist.github.com/anonymous/989ab8a1bb6ec14f6ea9

To run:

Usage: 
       For single image compression: java CompressAnImageToA4kibPreview -c <INPUT IMAGE> [<COMPRESSED IMAGE>]
       For multiple image compression: java CompressAnImageToA4kibPreview -c <INPUT IMAGES DIR> [<COMPRESSED IMAGE DIR>]
       For single image decompression: java CompressAnImageToA4kibPreview -d <COMPRESSED IMAGE> [<DECOMPRESSED IMAGE>]
       For multiple image decompression: java CompressAnImageToA4kibPreview -d <COMPRESSED IMAGE DIR> [<DECOMPRESSED IMAGES DIR>]

If optional parameters are not set then defaults will be used:
       For single image compression, compressed image will be created in same directory as input image and have '.compressed' file extension.
       For multiple image compression, compressed images will be created in a new 'out' sub directory of <INPUT IMAGES DIR> and have '.compressed' file extensions.
       For single image decompression, decompressed image will be created in same directory as input image and have '.out.png' file extension.
       For multiple image decompression, decompressed images will be created in a new 'out' sub directory of <COMPRESSED IMAGE DIR> and have '.png' file extensions.

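The first time this application is run, the required files will be generated and saved in a directory relative to the execution working directory. This may take a few minutes. Subsequent executions will not need to perform this step.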
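
The generate-once pattern described here can be sketched in a few lines of Python. This is an illustration, not the Java implementation: the function name, the JSON cache format and the random placeholder data are all assumptions standing in for the real (slow) patch generation.

```python
import json
import os
import random

def load_or_build_patches(path, count=8, patch_len=4, seed=1):
    """Return cached patch data from `path`, generating and saving it on first use."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)          # later runs: just read the cache
    rng = random.Random(seed)            # placeholder for the slow generation step
    patches = [[rng.randrange(256) for _ in range(patch_len)] for _ in range(count)]
    with open(path, "w") as f:
        json.dump(patches, f)
    return patches
```

Because the cache is built from generated data rather than from the corpus, both the compressor and the decompressor can rebuild or reuse it independently.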

— Moogie

This looks amazing. Just to confirm, steps 1-6 do not rely on the corpus at all? Also, would you mind if I rehost your code on gist.github.com instead?

— orlp

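Correct, it does not use any of the corpus files as input. You can see the images it generates to produce the patches by changing the constant "OUTPUT_SAMPLE_IMAGES" to true. It will output these images to the working folder: data/images/working/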

— Moogie

@orlp Now using gist.github.

— Moogie

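The results are stunning, but doesn't using deflate/inflate break the rule that disallows generic compression/decompression?

— algmyr

@algmyr It has been a while, but I think I interpreted the "no generic compression" rule to mean no generic "image" compression, i.e. jpeg, etc. But I may have interpreted this incorrectly, in which case, yes, this submission should be disqualified.

— Moogie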