Automatic detection of the rotation angle of an arbitrary image with orthogonal features

I have a task at hand where I need to detect the angle of an image like the following sample (part of a microchip photograph). The image does contain orthogonal features, but they can differ in size and in resolution/sharpness. The image is slightly imperfect because of some optical distortion and aberrations. Sub-pixel angle detection accuracy is required (i.e. the error should be well under 0.1°; something like 0.01° would be tolerable). For reference, the optimal angle for this image is around 32.19°.

Currently I have tried two approaches. Both do a brute-force search for a local minimum with a 2° step, then gradient descent down to a step size of 0.0001°.

1) The merit function is sum(pow(img(x+1)-img(x-1), 2) + pow(img(y+1)-img(y-1), 2)) calculated across the image. When horizontal/vertical lines are aligned, there is less change in the horizontal/vertical directions. Precision was about 0.2°.
2) The merit function is (max-min) over some stripe width/height of the image. This stripe is also looped across the image, and the merit function is accumulated. This approach also focuses on smaller changes of brightness when horizontal/vertical lines are aligned, but it can detect smaller changes over a larger base (the stripe width, which can be around 100 pixels). This gives better precision, up to 0.01°, but has many parameters to tweak (stripe width/height, for example, is quite sensitive), which could be unreliable in the real world.

An edge detection filter did not help much.

My concern is the very small change in the merit function, in both cases, between the worst and the best angle (<2x difference).

Do you have any better suggestions for writing a merit function for angle detection?

Update: A full-size sample image is uploaded here (51 MiB).

After all the processing it will end up looking like this.

image image-processing computer-vision

— BarsMonster

It is very sad that this was migrated from Stack Overflow to DSP. I don't see a DSP-like solution here, and the chances are now much reduced. 99.9% of DSP algorithms and tricks are useless for this task. It seems a custom algorithm or approach is needed here, not an FFT.

— BarsMonster

I'm very happy to tell you that it's totally wrong to be sad: DSP.SE is absolutely the right place to ask this! (Not so much Stack Overflow. It's not a programming question. You know your programming. You don't know how to process this image.) Images are signals, and DSP.SE is very much concerned with image processing! Also, a lot of general DSP tricks (known from, e.g., communication signals) are very applicable to your problem :)

— Marcus Müller

How important is efficiency?

— Cedron Dawg

By the way, even running at a resolution of 0.04°, I'm pretty sure the rotation is exactly 32° and not 32.19°. What is the resolution of your original photograph? At a width of 800 px, an uncorrected rotation of 0.01° amounts to a height difference of only 0.14 px, and that would be hardly noticeable even with strong interpolation.

— Marcus Müller

@CedronDawg Definitely no real-time requirements; I can tolerate some 10-60 seconds of computation on some 8-12 cores.

— BarsMonster

Answers:

If I understand your method 1 correctly, then with it, if you used a circularly symmetric region and did the rotation about the center of the region, you would eliminate the region's dependency on the rotation angle and get a fairer comparison by the merit function between different rotation angles. I will suggest a method that is essentially equivalent to that, but uses the full image and does not require repeated image rotation. It includes low-pass filtering to remove pixel grid anisotropy and for denoising.

Gradient of isotropically low-pass filtered image

First, let's calculate a local gradient vector at each pixel for the green color channel in the full-size sample image.

I derived horizontal and vertical differentiation kernels by differentiating the continuous-space impulse response of an ideal low-pass filter with a flat circular frequency response (which removes the effect of the choice of image axes by ensuring that the level of detail is no different diagonally than horizontally or vertically), by sampling the resulting function, and by applying a rotated cosine window:

$\begin{gather}h_x[x, y] = \begin{cases}0&\text{if }x = y = 0,\\-\displaystyle\frac{\omega_c^2\,x\,J_2\left(\omega_c\sqrt{x^2 + y^2}\right)}{2 \pi\,(x^2 + y^2)}&\text{otherwise,}\end{cases}\\ h_y[x, y] = \begin{cases}0&\text{if }x = y = 0,\\-\displaystyle\frac{\omega_c^2\,y\,J_2\left(\omega_c\sqrt{x^2 + y^2}\right)}{2 \pi\,(x^2 + y^2)}&\text{otherwise,}\end{cases}\end{gather}\tag{1}$

where $J_2$ is a 2nd-order Bessel function of the first kind, and $\omega_c$ is the cut-off frequency in radians. Python source (does not have the minus signs of Eq. 1):

import matplotlib.pyplot as plt
import scipy
import scipy.special
import numpy as np

def rotatedCosineWindow(N):  # N = horizontal size of the targeted kernel, also its vertical size, must be odd.
  return np.fromfunction(lambda y, x: np.maximum(np.cos(np.pi/2*np.sqrt(((x - (N - 1)/2)/((N - 1)/2 + 1))**2 + ((y - (N - 1)/2)/((N - 1)/2 + 1))**2)), 0), [N, N])

def circularLowpassKernelX(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.fromfunction(lambda y, x: omega_c**2*(x - (N - 1)/2)*scipy.special.jv(2, omega_c*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2))/(2*np.pi*((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2)), [N, N])
  kernel[(N - 1)//2, (N - 1)//2] = 0
  return kernel

def circularLowpassKernelY(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.fromfunction(lambda y, x: omega_c**2*(y - (N - 1)/2)*scipy.special.jv(2, omega_c*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2))/(2*np.pi*((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2)), [N, N])
  kernel[(N - 1)//2, (N - 1)//2] = 0
  return kernel

N = 41  # Horizontal size of the kernel, also its vertical size. Must be odd.
window = rotatedCosineWindow(N)

# Optional window function plot
#plt.imshow(window, vmin=-np.max(window), vmax=np.max(window), cmap='bwr')
#plt.colorbar()
#plt.show()

omega_c = np.pi/4  # Cutoff frequency in radians <= pi
kernelX = circularLowpassKernelX(omega_c, N)*window
kernelY = circularLowpassKernelY(omega_c, N)*window

# Optional kernel plot
#plt.imshow(kernelX, vmin=-np.max(kernelX), vmax=np.max(kernelX), cmap='bwr')
#plt.colorbar()
#plt.show()

Figure 1. 2-d rotated cosine window.

Figure 2. Windowed horizontal isotropic low-pass differentiation kernels, for different cut-off frequency $\omega_c$ settings. Top: omega_c = np.pi, middle: omega_c = np.pi/4, bottom: omega_c = np.pi/16. The minus sign of Eq. 1 was left out. Vertical kernels look the same but have been rotated 90 degrees. A weighted sum of the horizontal and the vertical kernels, with weights $\cos(\phi)$ and $\sin(\phi)$, respectively, gives an analysis kernel of the same type for gradient angle $\phi$.

Differentiation of the impulse response does not affect the bandwidth, as can be seen from its 2-d fast Fourier transform (FFT), in Python:

# Optional FFT plot
absF = np.abs(np.fft.fftshift(np.fft.fft2(circularLowpassKernelX(np.pi, N)*window)))
plt.imshow(absF, vmin=0, vmax=np.max(absF), cmap='Greys', extent=[-np.pi, np.pi, -np.pi, np.pi])
plt.colorbar()
plt.show()

Figure 3. Magnitude of the 2-d FFT of $h_x$. In the frequency domain, differentiation appears as multiplication of the flat circular pass band by $\omega_x$, and as a 90-degree phase shift, which is not visible in the magnitude.

To do the convolution for the green channel and to collect a 2-d gradient vector histogram, for visual inspection, in Python:

import scipy.ndimage

img = plt.imread('sample.tif').astype(float)
X = scipy.ndimage.convolve(img[:,:,1], kernelX)[(N - 1)//2:-(N - 1)//2, (N - 1)//2:-(N - 1)//2]  # Green channel only
Y = scipy.ndimage.convolve(img[:,:,1], kernelY)[(N - 1)//2:-(N - 1)//2, (N - 1)//2:-(N - 1)//2]  # ...

# Optional 2-d histogram
#hist2d, xEdges, yEdges = np.histogram2d(X.flatten(), Y.flatten(), bins=199)
#plt.imshow(hist2d**(1/2.2), vmin=0, cmap='Greys')
#plt.show()
#plt.imsave('hist2d.png', plt.cm.Greys(plt.Normalize(vmin=0, vmax=hist2d.max()**(1/2.2))(hist2d**(1/2.2))))  # To save the histogram image
#plt.imsave('histkey.png', plt.cm.Greys(np.repeat([(np.arange(200)/199)**(1/2.2)], 16, 0)))

This also crops the data, discarding (N - 1)//2 pixels from each edge that were contaminated by the rectangular image boundary, before the histogram analysis.

Figure 4. 2-d histograms of gradient vectors, for different low-pass filter cut-off frequency $\omega_c$ settings. In order: first with N=41: omega_c = np.pi, omega_c = np.pi/2, omega_c = np.pi/4 (same as in the Python listing), omega_c = np.pi/8, omega_c = np.pi/16, then: N=81: omega_c = np.pi/32, N=161: omega_c = np.pi/64. Denoising by low-pass filtering sharpens the circuit trace edge gradient orientations in the histogram.

Vector length weighted circular mean direction

There is the Yamartino method of finding the "average" wind direction from multiple wind vector samples in one pass through the samples. It is based on the mean of circular quantities, which is calculated as the shift of a cosine that is a sum of cosines, each shifted by a circular quantity of period $2\pi$. We can use a vector length weighted version of the same method, but first we need to bunch together all the directions that are equal modulo $\pi/2$. We can do this by multiplying the angle of each gradient vector $[X_k, Y_k]$ by 4, using a complex number representation:

$Z_k = \frac{(X_k + Y_k i)^4}{\sqrt{X_k^2 + Y_k^2}^3} = \frac{X_k^4 - 6X_k^2Y_k^2 + Y_k^4 + (4X_k^3Y_k - 4X_kY_k^3)i}{\sqrt{X_k^2 + Y_k^2}^3},\tag{2}$

satisfying $|Z_k| = \sqrt{X_k^2 + Y_k^2}$ and by later interpreting that the phases of $Z_k$ from $-\pi$ to $\pi$ represent angles from $-\pi/4$ to $\pi/4$, by dividing the calculated circular mean phase by 4:

$\phi = \frac{1}{4}\operatorname{atan2}\left(\sum_k\operatorname{Im}(Z_k), \sum_k\operatorname{Re}(Z_k)\right)\tag{3}$

where $\phi$ is the estimated image orientation.
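As a quick sanity check of Eqs. 2 and 3, the estimator can be run on synthetic gradient vectors whose true orientation is known. This is only a sketch: the helper name `estimate_orientation`, the noise level, and the sample count are assumptions made for illustration and are not part of the original listing.

```python
import numpy as np

def estimate_orientation(X, Y):
    """Length-weighted circular mean of gradient angles, folded modulo pi/2 (Eqs. 2 and 3)."""
    length = np.hypot(X, Y)
    nonzero = length > 0                       # zero-length vectors carry no direction
    Z = (X[nonzero] + 1j * Y[nonzero]) ** 4    # the 4th power multiplies every angle by 4
    Z = Z / length[nonzero] ** 3               # restores |Z_k| = gradient vector length
    return np.angle(np.sum(Z)) / 4             # result lies between -pi/4 and pi/4

# Synthetic gradients (assumed test data): four edge directions 90 degrees apart,
# with Gaussian angle noise and random lengths.
rng = np.random.default_rng(0)
true_phi = np.deg2rad(12.5)
angles = true_phi + rng.integers(0, 4, 20000) * np.pi / 2 + rng.normal(0, 0.05, 20000)
lengths = rng.uniform(0.5, 2.0, 20000)
X, Y = lengths * np.cos(angles), lengths * np.sin(angles)

phi_hat = estimate_orientation(X, Y)
print("estimated orientation:", np.rad2deg(phi_hat), "deg")
```

All four edge directions collapse onto a single phase after the 4th power, so the estimate should land close to the 12.5 degrees used to generate the data.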

The quality of the estimate can be assessed by doing another pass through the data and by calculating the mean weighted square circular distance, $\text{MSCD}$, between the phases of the complex numbers $Z_k$ and the estimated circular mean phase $4\phi$, with $|Z_k|$ as the weight:

$\begin{gather}\text{MSCD} = \frac{\sum_k|Z_k|\bigg(1 - \cos\Big(4\phi - \operatorname{atan2}\big(\operatorname{Im}(Z_k), \operatorname{Re}(Z_k)\big)\Big)\bigg)}{\sum_k|Z_k|}\\ = \frac{\sum_k\frac{|Z_k|}{2}\left(\left(\cos(4\phi) - \frac{\operatorname{Re}(Z_k)}{|Z_k|}\right)^2 + \left(\sin(4\phi) - \frac{\operatorname{Im}(Z_k)}{|Z_k|}\right)^2\right)}{\sum_k|Z_k|}\\ = \frac{\sum_k\big(|Z_k| - \operatorname{Re}(Z_k)\cos(4\phi) - \operatorname{Im}(Z_k)\sin(4\phi)\big)}{\sum_k|Z_k|},\end{gather}\tag{4}$
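The last form of Eq. 4 also shows which reference phase makes the distance smallest. As a short worked step, treat the reference phase as a free variable $\psi$ in place of $4\phi$. The shorthand symbols $A$, $B$, $S$ and $R$ are introduced here only for this derivation and are not the answer's notation.

```latex
\begin{gather}
A = \sum_k \operatorname{Re}(Z_k),\quad
B = \sum_k \operatorname{Im}(Z_k),\quad
S = \sum_k |Z_k|,\quad
R = \sqrt{A^2 + B^2},\\
\text{MSCD}(\psi) = 1 - \frac{A\cos(\psi) + B\sin(\psi)}{S}
= 1 - \frac{R}{S}\cos\big(\psi - \operatorname{atan2}(B, A)\big).
\end{gather}
```

The cosine is largest when $\psi = \operatorname{atan2}(B, A)$, which is exactly $4\phi$ from Eq. 3, and the smallest attainable value is $1 - R/S$.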

which is minimized by the $\phi$ calculated per Eq. 3. In Python:

absZ = np.sqrt(X**2 + Y**2)
reZ = (X**4 - 6*X**2*Y**2 + Y**4)/absZ**3
imZ = (4*X**3*Y - 4*X*Y**3)/absZ**3
phi = np.arctan2(np.sum(imZ), np.sum(reZ))/4

sumWeighted = np.sum(absZ - reZ*np.cos(4*phi) - imZ*np.sin(4*phi))
sumAbsZ = np.sum(absZ)
mscd = sumWeighted/sumAbsZ

print("rotate", -phi*180/np.pi, "deg, RMSCD =", np.arccos(1 - mscd)/4*180/np.pi, "deg equivalent (weight = length)")

Based on my mpmath experiments (not shown), I think we won't run out of numerical precision even for very large images. For different filter settings (annotated), the outputs are, as reported between -45 and 45 degrees:

rotate 32.29809399495655 deg, RMSCD = 17.057059965741338 deg equivalent (omega_c = np.pi)
rotate 32.07672617150525 deg, RMSCD = 16.699056648843566 deg equivalent (omega_c = np.pi/2)
rotate 32.13115293914797 deg, RMSCD = 15.217534399922902 deg equivalent (omega_c = np.pi/4, same as in the Python listing)
rotate 32.18444156018288 deg, RMSCD = 14.239347706786056 deg equivalent (omega_c = np.pi/8)
rotate 32.23705383489169 deg, RMSCD = 13.63694582160468 deg equivalent (omega_c = np.pi/16)

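The RMSCD equivalent angle is calculated as $\operatorname{acos}(1 - \text{MSCD})$, divided by 4 in the listing to map the phase back to an image angle. In the outputs above it decreases as the cut-off frequency is lowered.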

Alternative square-length weight function

Let's try the square of the vector length as an alternative weight function, by:

$Z_k = \frac{(X_k + Y_k i)^4}{\sqrt{X_k^2 + Y_k^2}^2} = \frac{X_k^4 - 6X_k^2Y_k^2 + Y_k^4 + (4X_k^3Y_k - 4X_kY_k^3)i}{X_k^2 + Y_k^2},\tag{5}$

In Python:

absZ_alt = X**2 + Y**2
reZ_alt = (X**4 - 6*X**2*Y**2 + Y**4)/absZ_alt
imZ_alt = (4*X**3*Y - 4*X*Y**3)/absZ_alt
phi_alt = np.arctan2(np.sum(imZ_alt), np.sum(reZ_alt))/4

sumWeighted_alt = np.sum(absZ_alt - reZ_alt*np.cos(4*phi_alt) - imZ_alt*np.sin(4*phi_alt))
sumAbsZ_alt = np.sum(absZ_alt)
mscd_alt = sumWeighted_alt/sumAbsZ_alt

print("rotate", -phi_alt*180/np.pi, "deg, RMSCD =", np.arccos(1 - mscd_alt)/4*180/np.pi, "deg equivalent (weight = length^2)")

The square-length weight reduces the RMSCD equivalent angle by about a degree:

rotate 32.264713568426764 deg, RMSCD = 16.06582418749094 deg equivalent (weight = length^2, omega_c = np.pi, N = 41)
rotate 32.03693157762725 deg, RMSCD = 15.839593856962486 deg equivalent (weight = length^2, omega_c = np.pi/2, N = 41)
rotate 32.11471435914187 deg, RMSCD = 14.315371970649874 deg equivalent (weight = length^2, omega_c = np.pi/4, N = 41)
rotate 32.16968341455537 deg, RMSCD = 13.624896827482049 deg equivalent (weight = length^2, omega_c = np.pi/8, N = 41)
rotate 32.22062839958777 deg, RMSCD = 12.495324176281466 deg equivalent (weight = length^2, omega_c = np.pi/16, N = 41)
rotate 32.22385477783647 deg, RMSCD = 13.629915935941973 deg equivalent (weight = length^2, omega_c = np.pi/32, N = 81)
rotate 32.284350817263906 deg, RMSCD = 12.308297934977746 deg equivalent (weight = length^2, omega_c = np.pi/64, N = 161)

For $\omega_c = \pi/32$ and $\omega_c = \pi/64$, a larger kernel size N was used (81 and 161, respectively).

1-d histogram

A 1-d histogram of the phases of $Z_k$ can be plotted for each weighting, in Python:

# Optional histogram
hist_plain, bin_edges = np.histogram(np.arctan2(imZ, reZ), weights=np.ones(absZ.shape)/absZ.size, bins=900)
hist, bin_edges = np.histogram(np.arctan2(imZ, reZ), weights=absZ/np.sum(absZ), bins=900)
hist_alt, bin_edges = np.histogram(np.arctan2(imZ_alt, reZ_alt), weights=absZ_alt/np.sum(absZ_alt), bins=900)
plt.plot((bin_edges[:-1]+(bin_edges[1]-bin_edges[0])/2)*45/np.pi, hist_plain, "black")  # x = bin centers, phase/4 in degrees
plt.plot((bin_edges[:-1]+(bin_edges[1]-bin_edges[0])/2)*45/np.pi, hist, "red")
plt.plot((bin_edges[:-1]+(bin_edges[1]-bin_edges[0])/2)*45/np.pi, hist_alt, "blue")
plt.xlabel("angle (degrees)")
plt.show()

Figure 5. Histograms of gradient vector angles, wrapped to $-\pi/4\ldots\pi/4$ and weighted by (in order from bottom to top at the peak): no weighting (black), gradient vector length (red), square of gradient vector length (blue). The bin width is 0.1 degrees. The filter cut-off was omega_c = np.pi/4, the same as in the Python listing. The bottom figure is zoomed in on the peaks.

Steerable filter math

We have seen that the approach works, but it would be good to have a better mathematical understanding. The $x$ and $y$ differentiation filter impulse responses given by Eq. 1 can be understood as the basis functions for forming the impulse response of a steerable differentiation filter that is sampled from a rotation of the right side of the equation for $h_x[x, y]$ (Eq. 1). This is more easily seen by converting Eq. 1 to polar coordinates:

$\begin{align}h_x(r, \theta) = h_x[r\cos(\theta), r\sin(\theta)] &= \begin{cases}0&\text{if }r = 0,\\-\displaystyle\frac{\omega_c^2\,r\cos(\theta)\,J_2\left(\omega_c r\right)}{2 \pi\,r^2}&\text{otherwise}\end{cases}\\ &= \cos(\theta)f(r),\\ h_y(r, \theta) = h_y[r\cos(\theta), r\sin(\theta)] &= \begin{cases}0&\text{if }r = 0,\\-\displaystyle\frac{\omega_c^2\,r\sin(\theta)\,J_2\left(\omega_c r\right)}{2 \pi\,r^2}&\text{otherwise}\end{cases}\\ &= \sin(\theta)f(r),\\ f(r) &= \begin{cases}0&\text{if }r = 0,\\-\displaystyle\frac{\omega_c^2\,r\,J_2\left(\omega_c r\right)}{2 \pi\,r^2}&\text{otherwise,}\end{cases}\end{align}\tag{6}$

where both the horizontal and the vertical differentiation filter impulse responses have the same radial factor function $f(r)$ . Any rotated version $h(r, \theta, \phi)$ of $h_x(r, \theta)$ by steering angle $\phi$ is obtained by:

$h(r, \theta, \phi) = h_x(r, \theta - \phi) = \cos(\theta - \phi)f(r)\tag{7}$

The idea was that the steered kernel $h(r, \theta, \phi)$ can be constructed as a weighted sum of $h_x(r, \theta)$ and $h_y(r, \theta)$, with $\cos(\phi)$ and $\sin(\phi)$ as the weights, and that is indeed the case:

$\cos(\phi) h_x(r, \theta) + \sin(\phi) h_y(r, \theta) = \cos(\phi) \cos(\theta) f(r) + \sin(\phi) \sin(\theta) f(r) = \cos(\theta - \phi) f(r) = h(r, \theta, \phi).\tag{8}$
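Eq. 8 can also be checked numerically on a sampled grid. The following is a minimal sketch, assuming an unwindowed kernel of size N = 21 and the helper name `radial_factor`, both chosen here for illustration. Unlike the Python listing earlier in the answer, it keeps the minus sign of Eq. 6.

```python
import numpy as np
from scipy.special import jv

def radial_factor(r, omega_c):
    """Radial factor f(r) of Eq. 6, with f(0) = 0."""
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r == 0, 1.0, r)   # placeholder to avoid 0/0; overwritten below
    f = -omega_c**2 * jv(2, omega_c * safe_r) / (2 * np.pi * safe_r)
    return np.where(r == 0, 0.0, f)

N = 21                                   # kernel size (assumed), odd
omega_c = np.pi / 4
half = (N - 1) // 2
y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
r = np.hypot(x, y)
theta = np.arctan2(y, x)
f = radial_factor(r, omega_c)

h_x = np.cos(theta) * f                  # Eq. 6
h_y = np.sin(theta) * f

phi = np.deg2rad(32.19)                  # steering angle
steered = np.cos(phi) * h_x + np.sin(phi) * h_y   # left side of Eq. 8
direct = np.cos(theta - phi) * f                  # Eq. 7
print("max difference:", np.max(np.abs(steered - direct)))
```

The two kernels should agree to within floating-point rounding, which is what makes the pair $h_x$, $h_y$ a sufficient basis for any steering angle.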

We will arrive at an equivalent conclusion if we think of the isotropically low-pass filtered signal as the input signal and construct a partial derivative operator with respect to the first of rotated coordinates $x_\phi$, $y_\phi$ rotated by angle $\phi$ from coordinates $x$, $y$. (Differentiation can be considered a linear time-invariant system.) We have:

$\begin{gather}x = \cos(\phi)x_\phi - \sin(\phi)y_\phi,\\ y = \sin(\phi)x_\phi + \cos(\phi)y_\phi\end{gather}\tag{9}$

Using the chain rule for partial derivatives, the partial derivative operator with respect to $x_\phi$ can be expressed as a cosine and sine weighted sum of partial derivatives with respect to $x$ and $y$ :

$\begin{gather}\frac{\partial}{\partial x_\phi} = \frac{\partial x}{\partial x_\phi}\frac{\partial}{\partial x} + \frac{\partial y}{\partial x_\phi}\frac{\partial}{\partial y} = \frac{\partial \big(\cos(\phi)x_\phi - \sin(\phi)y_\phi\big)}{\partial x_\phi}\frac{\partial}{\partial x} + \frac{\partial \big(\sin(\phi)x_\phi + \cos(\phi)y_\phi\big)}{\partial x_\phi}\frac{\partial}{\partial y} = \cos(\phi)\frac{\partial}{\partial x} + \sin(\phi)\frac{\partial}{\partial y}\end{gather}\tag{10}$
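Eq. 10 can be illustrated with a finite-difference check on any smooth function. The test function, the evaluation point, and the step size below are arbitrary assumptions; the sketch only compares a numerical derivative along the rotated $x_\phi$ axis with the cosine and sine weighted sum of the ordinary partial derivatives.

```python
import numpy as np

def f(x, y):
    """Arbitrary smooth test function (an assumption for this check)."""
    return np.sin(x) * np.cos(2 * y)

def grad_f(x, y):
    """Analytic partial derivatives of f with respect to x and y."""
    return np.cos(x) * np.cos(2 * y), -2 * np.sin(x) * np.sin(2 * y)

phi = np.deg2rad(32.19)
x0, y0 = 0.7, -0.3
eps = 1e-6

# A step of eps along x_phi moves (x, y) by eps*(cos(phi), sin(phi)), per Eq. 9.
dx, dy = eps * np.cos(phi), eps * np.sin(phi)
numeric = (f(x0 + dx, y0 + dy) - f(x0 - dx, y0 - dy)) / (2 * eps)

fx, fy = grad_f(x0, y0)
steered = np.cos(phi) * fx + np.sin(phi) * fy     # right side of Eq. 10 applied to f
print(numeric, steered)
```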

A question that remains to be explored is how a suitably weighted circular mean of gradient vector angles is related to the angle $\phi$ of in some way the "most activated" steered differentiation filter.

Possible improvements

To possibly improve results further, the gradient can be calculated also for the red and blue color channels, to be included as additional data in the "average" calculation.

I have in mind possible extensions of this method:

1) Use a larger set of analysis filter kernels and detect edges rather than detecting gradients. This needs to be carefully crafted so that edges in all directions are treated equally, that is, an edge detector for any angle should be obtainable by a weighted sum of orthogonal kernels. A set of suitable kernels can (I think) be obtained by applying the differential operators of Eq. 11, Fig. 6 (see also my Mathematics Stack Exchange post) on the continuous-space impulse response of a circularly symmetric low-pass filter.

$\begin{gather}\lim_{h\to 0}\frac{\sum_{n=0}^{4N + 1} (-1)^n f\bigg(x + h\cos\left(\frac{2\pi n}{4N + 2}\right), y + h\sin\left(\frac{2\pi n}{4N + 2}\right)\bigg)}{h^{2N + 1}},\\ \lim_{h\to 0}\frac{\sum_{n=0}^{4N + 1} (-1)^n f\bigg(x + h\sin\left(\frac{2\pi n}{4N + 2}\right), y + h\cos\left(\frac{2\pi n}{4N + 2}\right)\bigg)}{h^{2N + 1}}\end{gather}\tag{11}$

Figure 6. Dirac delta relative locations in differential operators for construction of higher-order edge detectors.

2) The calculation of a (weighted) mean of circular quantities can be understood as summing of cosines of the same frequency shifted by samples of the quantity (and scaled by the weight), and finding the peak of the resulting function. If similarly shifted and scaled harmonics of the shifted cosine, with carefully chosen relative amplitudes, are added to the mix, forming a sharper smoothing kernel, then multiple peaks may appear in the total sum and the peak with the largest value can be reported. With a suitable mixture of harmonics, that would give a kind of local average that largely ignores outliers away from the main peak of the distribution.
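The second extension can be sketched as follows. This is an illustration under stated assumptions rather than a worked-out design: the function name `harmonic_peak`, the triangular harmonic amplitudes, the grid resolution, and the synthetic two-cluster data are all chosen here and do not come from the answer. With a single harmonic the peak coincides with the ordinary weighted circular mean; with several harmonics the kernel sharpens and the largest peak follows the main cluster.

```python
import numpy as np

def harmonic_peak(psi, weights, amplitudes, grid_size=3600):
    """Grid location of the peak of sum_k w_k * sum_m a_m * cos(m * (grid - psi_k))."""
    grid = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
    total = np.zeros(grid_size)
    for m, a_m in enumerate(amplitudes, start=1):
        moment = np.sum(weights * np.exp(1j * m * psi))   # m-th weighted circular moment
        total += a_m * np.real(moment * np.exp(-1j * m * grid))
    return grid[np.argmax(total)]

# Synthetic phases (assumed data): a main cluster at 0.5 rad and outliers at -2.0 rad.
rng = np.random.default_rng(1)
psi = np.concatenate([rng.normal(0.5, 0.05, 7000), rng.normal(-2.0, 0.05, 3000)])
weights = np.ones_like(psi)

M = 8
amplitudes = 1.0 - np.arange(1, M + 1) / (M + 1)   # assumed triangular taper of the harmonics

peak_plain = harmonic_peak(psi, weights, [1.0])        # fundamental only: circular mean
peak_sharp = harmonic_peak(psi, weights, amplitudes)   # with harmonics: local average
print(peak_plain, peak_sharp)
```

In this toy data the plain circular mean is pulled well away from the main cluster by the outliers, while the sharpened sum peaks near 0.5 rad. How to choose the relative amplitudes for real gradient data is left open here, as it is in the text above.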

Alternative approaches

It would also be possible to convolve the image by angle $\phi$ and angle $\phi + \pi/2$ rotated "long edge" kernels, and to calculate the mean square of the pixels of the two convolved images. The angle $\phi$ that maximizes the mean square would be reported. This approach might give a good final refinement for the image orientation finding, because it is risky to search the complete angle $\phi$ space at large steps.

Another approach is non-local methods, like cross-correlating distant similar regions, applicable if you know that there are long horizontal or vertical traces, or features that repeat many times horizontally or vertically.

— Olli Niemitalo

How accurate was the result you got?

— Royi

@Royi Maybe around 0.1 deg.

— Olli Niemitalo

@OlliNiemitalo which is pretty impressive, given the limited resolution!

— Marcus Müller

@OlliNiemitalo speaking of impressive: this. answer. is. that. word's. very. definition.

— Marcus Müller

@MarcusMüller Thanks Marcus, I anticipate the first extension to be very interesting too.

— Olli Niemitalo

There is a similar DSP trick here, but I don't remember the details exactly.

I read about it somewhere, some while ago. It has to do with figuring out fabric pattern matches regardless of the orientation. So you may want to research on that.

Grab a circle sample. Do sums along spokes of the circle to get a circumference profile. Then they did a DFT on that (it is inherently circular after all). Toss the phase information (make it orientation independent) and make a comparison.

Then they could tell whether two fabrics had the same pattern.
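The fabric-matching idea can be sketched in a few lines. The helper names, the nearest-pixel sampling, and the synthetic test image are assumptions made for illustration; the original description gives no implementation. Rotating the pattern about the circle center shifts the circumference profile circularly (up to sampling effects), and the DFT magnitude is unchanged by a circular shift.

```python
import numpy as np

def circumference_profile(img, center, radius, n_spokes=360, n_samples=64):
    """Sum of nearest-pixel samples along each spoke of a circle (hypothetical helper)."""
    cy, cx = center
    angles = 2 * np.pi * np.arange(n_spokes) / n_spokes
    radii = np.linspace(0, radius, n_samples)
    ys = np.rint(cy + np.outer(np.sin(angles), radii)).astype(int)
    xs = np.rint(cx + np.outer(np.cos(angles), radii)).astype(int)
    return img[ys, xs].sum(axis=1)          # one value per spoke

def orientation_free_signature(profile):
    """DFT magnitude of the circular profile; tossing the phase removes the orientation."""
    return np.abs(np.fft.fft(profile))

# Synthetic striped image (assumed test data)
yy, xx = np.mgrid[0:101, 0:101]
img = np.sin(0.3 * xx) + 0.5 * np.cos(0.2 * yy)

profile = circumference_profile(img, center=(50, 50), radius=45)
shifted = np.roll(profile, 37)              # stands in for a rotated view of the same pattern
same = np.allclose(orientation_free_signature(profile), orientation_free_signature(shifted))
print("signatures match:", same)
```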

Your problem is similar.

It seems to me, without trying it first, that the characteristics of the pre DFT profile should reveal the orientation. Doing standard deviations along the spokes instead of sums should work better, maybe both.

Now, if you had an oriented reference image, you could use their technique.

Ced

Your precision requirements are rather strict.

I gave this a whack. Taking the sum of the absolute values of the differences between two subsequent points along the spoke for each color.

Here is a graph of around the circumference. Your value is plotted with the white markers.

You can sort of see it, but I don't think this is going to work for you. Sorry.

Progress Report: Some

I've decided on a three step process.

1) Find evaluation spot.

2) Coarse Measurement

3) Fine Measurement

Currently, the first step is user intervention. It should be automatable, but I'm not bothering. I have a rough draft of the second step. There's some tweaking I want to try. Finally, I have a few candidates for the third step that is going to take testing to see which works best.

The good news is it is lightning fast. If your only purpose is to make an image look level on a web page, then your tolerances are way too strict and the coarse measurement ought to be accurate enough.

This is the coarse measurement. Each pixel is about 0.6 degrees. (Edit, actually 0.3)

Progress Report: Able to get good results

Most aren't this good, but they are cheap (and fairly local) and finding spots to get good reads is easy..... for a human. Brute force should work fine for a program.

The results can be much improved on, this is a simple baseline test. I'm not ready to do any explaining yet, nor post the code, but this screen shot ain't photoshopped.

Progress Report: The code is posted, I'm done with this for a while.

This screenshot is the program working on Marcus' 45 degree shot.

The color channels are processed independently.

A point is selected as the sweep center.

A diameter is swept through 180 degrees at discrete angles

At each angle, "volatility" is measured across the diameter. A trace is made for each channel gathering samples. The sample value is a linear interpolation of the four corner values of whichever grid square the sample spot lands on.

For each channel trace

The samples are multiplied by a VonHann window function

A Smooth/Differ pass is made on the samples

The RMS of the Differ is used as a volatility measure

The lower row graphs are:

First is the sweep of 0 to 180 degrees, each pixel is 0.5 degrees. Second is the sweep around the selected angle, each pixel is 0.1 degrees. Third is the sweep around the selected angle, each pixel is 0.01 degrees. Fourth is the trace Differ curve

The initial selection is the minimal average volatility of the three channels. This will be close, but usually not on, the best angle. The symmetry at the trough is a better indicator than the minimum. A best fit parabola in that neighborhood should yield a very good answer.
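The parabola fit mentioned above could look like the following sketch. The function name `refine_minimum`, the window half-width, and the synthetic volatility curve are assumptions for illustration; they are not taken from the Gambas program.

```python
import numpy as np

def refine_minimum(angles, volatility, half_width=5):
    """Vertex of a least-squares parabola fitted around the sampled minimum."""
    i = int(np.argmin(volatility))
    lo, hi = max(i - half_width, 0), min(i + half_width + 1, len(angles))
    # Center the abscissa on the sampled minimum to keep the fit well conditioned.
    a, b, _ = np.polyfit(angles[lo:hi] - angles[i], volatility[lo:hi], 2)
    if a <= 0:
        raise ValueError("no trough in the fitted neighborhood")
    return angles[i] - b / (2 * a)

angles = np.arange(30.0, 34.0, 0.1)               # sweep in 0.1 degree steps (assumed)
volatility = 2.0 + 0.8 * (angles - 32.19) ** 2    # synthetic trough, not measured data
best = refine_minimum(angles, volatility)
print("refined angle:", best)
```

Because the fit uses several samples on both sides of the trough, it follows the symmetry of the curve rather than the single lowest sample, which matches the observation that the symmetry is the better indicator.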

The source code (in Gambas, PPA gambas-team/gambas3) can be found at:

https://forum.gambas.one/viewtopic.php?f=4&t=707

It is an ordinary zip file, so you don't have to install Gambas to look at the source. The files are in the ".src" subdirectory.

Removing the VonHann window yields higher accuracy because it effectively lengthens the trace, but adds wobbles. Perhaps a double VonHann would be better as the center is unimportant and a quicker onset of "when the teeter-totter hits the ground" will be detected. Accuracy can easily be improved by increasing the trace length as far as the image allows (yes, that's automatable). A better window function, sinc?

The measures I have taken at the current settings confirm the 3.19 value +/-.03 ish.

This is just the measuring tool. There are several strategies I can think of to apply it to the image. That, as they say, is an exercise for the reader. Or in this case, the OP. I'll be trying my own later.

There's head room for improvement in both the algorithm and the program, but already they are really useful.

Here is how the linear interpolation works

'---- Whole Number Portion

        x = Floor(rx)
        y = Floor(ry)

'---- Fractional Portions

        fx = rx - x
        fy = ry - y

        gx = 1.0 - fx
        gy = 1.0 - fy

'---- Weighted Average

        vtl = ArgValues[x, y] * gx * gy         ' Top Left
        vtr = ArgValues[x + 1, y] * fx * gy     ' Top Right
        vbl = ArgValues[x, y + 1] * gx * fy     ' Bottom Left
        vbr = ArgValues[x + 1, y + 1] * fx * fy ' Bottom Right

        v = vtl + vtr + vbl + vbr
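For readers who do not use Gambas, the same four-corner weighted average can be written in Python roughly as follows. This is a sketch: the function name `sample_four_corner` and the small test grid are made up here, and the array is indexed as [x, y] only to mirror the listing above.

```python
import math
import numpy as np

def sample_four_corner(values, rx, ry):
    """Weighted average of the four grid points around (rx, ry); values is indexed [x, y]."""
    x, y = math.floor(rx), math.floor(ry)   # whole number portion
    fx, fy = rx - x, ry - y                 # fractional portions
    gx, gy = 1.0 - fx, 1.0 - fy
    return (values[x, y] * gx * gy              # top left
            + values[x + 1, y] * fx * gy        # top right
            + values[x, y + 1] * gx * fy        # bottom left
            + values[x + 1, y + 1] * fx * fy)   # bottom right

grid = np.array([[0.0, 10.0], [20.0, 30.0]])
center_value = sample_four_corner(grid, 0.5, 0.5)   # equal weights on all four corners
print(center_value)
```

As in the Gambas version, the caller has to keep rx and ry at least one grid step away from the last row and column, because the function reads index x + 1 and y + 1.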

Anybody know the conventional name for that?

— Cedron Dawg

hey, you don't need to be sorry for something that was a very clever approach, and might be super helpful for someone with a similar problem who'll come here later! +1

— Marcus Müller

@BarsMonster, I am making good progress. You will want to install Gambas (PPA: gambas-team/gambas3) on your Linux box. (Likely, you too Marcus and Olli, if you can.) I'm working on a program that will not only tackle this problem, but will also serve as a good base for other image processing tasks.

— Cedron Dawg

looking forward!

— Marcus Müller

@CedronDawg that's called bilinear interpolation, here's why, indicating also to an alternative implementation.

— Olli Niemitalo

@OlliNiemitalo, thanks Olli. In this situation, I don't think going bicubic would improve results over bilinear; in fact, it may even be detrimental. Later, I will play around with different volatility metrics along the diameter, and differently shaped window functions. At this point I am thinking of using a VonHann at the ends of the diameter like paddles or "teeter-totter seats hitting the mud". The flat bottom in the curve is where the teeter-totter hasn't hit the ground (edge) yet. Halfway between the two corners is a good read. The current settings are good to less than 0.1 degrees.

— Cedron Dawg

Rather performance intensive, but should get you accuracy as wanted:

1. Edge-detect the image.
2. Hough-transform it to a space where you have enough pixels for the wanted accuracy.
3. Because there are enough orthogonal lines, the image in Hough space will contain maxima lying on two lines. These are easily detectable and give you the desired angle.

— RobAu

Nice, exactly my approach: I'm kind of sad that I didn't see it before I went on my train ride and thus didn't incorporate it in my answer. A clear +1!

— Marcus Müller

I went ahead and basically adjusted the Hough transform example of OpenCV to your use case. The idea is nice, but since your image already has plenty of edges due to its edgy nature, the edge detection shouldn't have much benefit.

So, what I did on top of said example was:

1. Omit the edge detection.
2. Decompose your input image into color channels and process them separately.
3. Count the occurrences of lines at each angle (after quantizing the angles and taking them modulo 90°, since you have plenty of right angles).
4. Combine the counters of the color channels.
5. Correct these rotations.

What you could do to further improve the quality of estimation (as you'll see below, the top guess wasn't right; the second was) would probably amount to converting the image to a grayscale image that best represents the actual differences between different materials. Clearly, the RGB channels aren't the best. You're the semiconductor expert, so find a way to combine the color channels in a way that maximizes the difference between e.g. metallization and silicon.
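One generic, data-driven way to combine the channels is to project every pixel onto the first principal component of the color distribution. The sketch below is not what I used for the results below; it is an assumption about what might work, and the function name `pca_grayscale` is made up for illustration. It only needs NumPy.

```python
import numpy as np

def pca_grayscale(img):
    """Project an H x W x C image onto the color axis of largest variance, scaled to 0..255."""
    pixels = img.reshape(-1, img.shape[-1]).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Channel covariance matrix (C x C)
    cov = centered.T @ centered / max(len(centered) - 1, 1)
    # eigh returns eigenvalues in ascending order, so the last eigenvector is the main axis
    _, eigvecs = np.linalg.eigh(cov)
    gray = centered @ eigvecs[:, -1]
    gray -= gray.min()
    if gray.max() > 0:
        gray /= gray.max()
    return (255 * gray).reshape(img.shape[:2]).astype(np.uint8)

# Tiny synthetic example with two "materials" of different color
demo = np.zeros((4, 4, 3), dtype=np.uint8)
demo[:, :2] = (10, 200, 30)
demo[:, 2:] = (200, 10, 30)
demo_gray = pca_grayscale(demo)
```

The sign of a principal axis is arbitrary, so which material ends up bright is arbitrary as well. That does not matter for line detection.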

My Jupyter notebook is here. See the results below.

To increase the angular resolution, increase the QUANT_STEPS variable and the angular resolution passed to the cv2.HoughLines call. I didn't, because I wanted this code to be written in < 20 min, and thus didn't want to invest a minute in computation.

import cv2
import numpy
from matplotlib import pyplot
import collections

QUANT_STEPS = 360*2

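def quantized_angle(line, quant = QUANT_STEPS):
    theta = line[0][1]  # each HoughLines entry is [[rho, theta]], theta in radians
    # snap to a grid of 360/quant degrees, then fold modulo 90° (right angles are equivalent)
    return numpy.round(theta / numpy.pi / 2 * quant) / quant * 360 % 90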

def detect_rotation(monochromatic_img):
    # edges = cv2.Canny(monochromatic_img, 50, 150, apertureSize = 3) #play with these parameters
    lines = cv2.HoughLines(monochromatic_img, #input
                           1, # rho resolution [px]
                           numpy.pi/180, # angular resolution [radian]
                           200) # accumulator threshold – higher = fewer candidates
    counter = collections.Counter(quantized_angle(line) for line in lines)
    return counter

img = cv2.imread("/tmp/HIKRe.jpg") #Image directly as grabbed from imgur.com
total_count = collections.Counter()
for channel in range(img.shape[-1]):
    total_count.update(detect_rotation(img[:,:,channel]))

most_common = total_count.most_common(5)

for angle,_ in most_common:
    pyplot.figure(figsize=(8,6), dpi=100)
    pyplot.title(f"{angle:.3f}°")
    # OpenCV expects the center as (x, y) and the output size as (width, height)
    rotation = cv2.getRotationMatrix2D((img.shape[1]/2, img.shape[0]/2), -angle, 1)
    pyplot.imshow(cv2.warpAffine(img, rotation, (img.shape[1], img.shape[0])))
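To put numbers on the resolution remark above: the histogram bins are 360/QUANT_STEPS degrees wide, so the setting used here cannot resolve anything finer than 0.5°. The sketch below repeats the quantization on a plain angle instead of a HoughLines result; the helper names `quantized_angle_deg` and `bin_width_deg` are made up for illustration.

```python
import numpy as np

def quantized_angle_deg(theta, quant_steps):
    """Snap an angle theta [rad] to a grid of 360/quant_steps degrees, folded modulo 90 degrees."""
    return np.round(theta / (2 * np.pi) * quant_steps) / quant_steps * 360 % 90

def bin_width_deg(quant_steps):
    """Width of one histogram bin in degrees."""
    return 360 / quant_steps

# A line perpendicular to the 32.19 degree direction folds back onto 32.19 degrees
theta = np.deg2rad(32.19 + 90)
coarse = quantized_angle_deg(theta, 360 * 2)    # 0.5 degree bins, as in the code above
fine = quantized_angle_deg(theta, 360 * 100)    # 0.01 degree bins
```

With 0.01° bins, the angular resolution handed to cv2.HoughLines has to be at least as fine (for instance numpy.pi/18000), otherwise many bins stay empty. The accumulator grows accordingly, which is where the extra computation time goes.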

— Marcus Müller

This is a go at the first suggested extension of my previous answer.

Ideal circularly symmetric band-limiting filters

We construct an orthogonal bank of four filters bandlimited to inside a circle of radius $\omega_c$ on the frequency plane. The impulse responses of these filters can be linearly combined to form directional edge detection kernels. An arbitrarily normalized set of orthogonal filter impulse responses is obtained by applying the first two pairs of "beach-ball like" differential operators to the continuous-space impulse response $h(x,y)$ of the circularly symmetric ideal band-limiting filter:

$h(x,y) = \frac{\omega_c}{2\pi \sqrt{x^2 + y^2} } J_1 \big( \omega_c \sqrt{x^2 + y^2} \big)\tag{1}$

$\begin{align}h_{0x}(x, y) &\propto \frac{d}{dx}h(x, y),\\ h_{0y}(x, y) &\propto \frac{d}{dy}h(x, y),\\ h_{1x}(x, y) &\propto \left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y),\\ h_{1y}(x, y) &\propto \left(\left(\frac{d}{dy}\right)^3-3\frac{d}{dy}\left(\frac{d}{dx}\right)^2\right)h(x, y)\end{align}\tag{2}$

$\begin{align}h_{0x}(x, y) &= \begin{cases}0&\text{if }x = y = 0,\\-\displaystyle\frac{\omega_c^2\,x\,J_2\left(\omega_c\sqrt{x^2 + y^2}\right)}{2 \pi\,(x^2 + y^2)}&\text{otherwise,}\end{cases}\\ h_{0y}(x, y) &= h_{0x}(y, x),\\ h_{1x}(x, y) &= \begin{cases}0&\text{if }x = y = 0,\\\frac{\begin{array}{l}\Big(\omega_c x(3y^2 - x^2)\big(J_0\left(\omega_c\sqrt{x^2 + y^2}\right)\omega_c\sqrt{x^2 + y^2}(\omega_c^2x^2 + \omega_c^2y^2 - 24)\\ - 8J_1\left(\omega_c\sqrt{x^2 + y^2}\right)(\omega_c^2x^2 + \omega_c^2y^2 - 6)\big)\Big)\end{array}}{2\pi(x^2 + y^2)^{7/2}}&\text{otherwise,}\end{cases}\\ h_{1y}(x, y) &= h_{1x}(y, x),\end{align}\tag{3}$

where $J_\alpha$ is a Bessel function of the first kind of order $\alpha$ and $\propto$ means "is proportional to". I used Wolfram Alpha queries ((ᵈ/dx)³; ᵈ/dx; ᵈ/dx(ᵈ/dy)²) to carry out differentiation, and simplified the result.
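The first line of Eq. 3 can be cross-checked numerically against Eq. 1 with a central finite difference. This is only a sanity-check sketch: the function names, the step `eps` and the three test points are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.special

def h(x, y, omega_c):
    """Eq. 1: impulse response of the ideal circular low-pass filter (valid away from the origin)."""
    r = np.sqrt(x**2 + y**2)
    return omega_c * scipy.special.j1(omega_c * r) / (2 * np.pi * r)

def h0x_closed_form(x, y, omega_c):
    """Eq. 3, first line: closed-form derivative of h with respect to x."""
    r2 = x**2 + y**2
    return -omega_c**2 * x * scipy.special.jv(2, omega_c * np.sqrt(r2)) / (2 * np.pi * r2)

def h0x_finite_difference(x, y, omega_c, eps=1e-5):
    """Central-difference approximation of dh/dx."""
    return (h(x + eps, y, omega_c) - h(x - eps, y, omega_c)) / (2 * eps)

omega_c = np.pi / 8
points = [(3.0, 1.5), (-7.25, 4.0), (0.5, -12.0)]
max_error = max(abs(h0x_closed_form(x, y, omega_c) - h0x_finite_difference(x, y, omega_c))
                for x, y in points)
```

The agreement follows from the Bessel identity $\frac{d}{dr}\big(J_1(\omega_c r)/r\big) = -\omega_c J_2(\omega_c r)/r$, so for $h_{0x}$ the proportionality constant in Eq. 2 is exactly 1.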

Truncated kernels in Python:

import matplotlib.pyplot as plt
import scipy
import scipy.special
import numpy as np

def h0x(x, y, omega_c):
  if x == 0 and y == 0:
    return 0
  return -omega_c**2*x*scipy.special.jv(2, omega_c*np.sqrt(x**2 + y**2))/(2*np.pi*(x**2 + y**2))

def h1x(x, y, omega_c):
  if x == 0 and y == 0:
    return 0
  return omega_c*x*(3*y**2 - x**2)*(scipy.special.j0(omega_c*np.sqrt(x**2 + y**2))*omega_c*np.sqrt(x**2 + y**2)*(omega_c**2*x**2 + omega_c**2*y**2 - 24) - 8*scipy.special.j1(omega_c*np.sqrt(x**2 + y**2))*(omega_c**2*x**2 + omega_c**2*y**2 - 6))/(2*np.pi*(x**2 + y**2)**(7/2))

def rotatedCosineWindow(N):  # N = horizontal size of the targeted kernel, also its vertical size, must be odd.
  return np.fromfunction(lambda y, x: np.maximum(np.cos(np.pi/2*np.sqrt(((x - (N - 1)/2)/((N - 1)/2 + 1))**2 + ((y - (N - 1)/2)/((N - 1)/2 + 1))**2)), 0), [N, N])

def circularLowpassKernel(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.fromfunction(lambda x, y: omega_c*scipy.special.j1(omega_c*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2))/(2*np.pi*np.sqrt((x - (N - 1)/2)**2 + (y - (N - 1)/2)**2)), [N, N])
  kernel[(N - 1)//2, (N - 1)//2] = omega_c**2/(4*np.pi)
  return kernel

def prototype0x(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.zeros([N, N])
  for y in range(N):
    for x in range(N):
      kernel[y, x] = h0x(x - (N - 1)/2, y - (N - 1)/2, omega_c)
  return kernel

def prototype0y(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  return prototype0x(omega_c, N).transpose()

def prototype1x(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  kernel = np.zeros([N, N])
  for y in range(N):
    for x in range(N):
      kernel[y, x] = h1x(x - (N - 1)/2, y - (N - 1)/2, omega_c)
  return kernel

def prototype1y(omega_c, N):  # omega = cutoff frequency in radians (pi is max), N = horizontal size of the kernel, also its vertical size, must be odd.
  return prototype1x(omega_c, N).transpose()

N = 321  # Horizontal size of the kernel, also its vertical size. Must be odd.
window = rotatedCosineWindow(N)

# Optional window function plot
#plt.imshow(window, vmin=-np.max(window), vmax=np.max(window), cmap='bwr')
#plt.colorbar()
#plt.show()

omega_c = np.pi/8  # Cutoff frequency in radians <= pi
lowpass = circularLowpassKernel(omega_c, N)
kernel0x = prototype0x(omega_c, N)
kernel0y = prototype0y(omega_c, N)
kernel1x = prototype1x(omega_c, N)
kernel1y = prototype1y(omega_c, N)

# Optional kernel image save
plt.imsave('lowpass.png', plt.cm.bwr(plt.Normalize(vmin=-lowpass.max(), vmax=lowpass.max())(lowpass)))
plt.imsave('kernel0x.png', plt.cm.bwr(plt.Normalize(vmin=-kernel0x.max(), vmax=kernel0x.max())(kernel0x)))
plt.imsave('kernel0y.png', plt.cm.bwr(plt.Normalize(vmin=-kernel0y.max(), vmax=kernel0y.max())(kernel0y)))
plt.imsave('kernel1x.png', plt.cm.bwr(plt.Normalize(vmin=-kernel1x.max(), vmax=kernel1x.max())(kernel1x)))
plt.imsave('kernel1y.png', plt.cm.bwr(plt.Normalize(vmin=-kernel1y.max(), vmax=kernel1y.max())(kernel1y)))
plt.imsave('kernelkey.png', plt.cm.bwr(np.repeat([(np.arange(321)/320)], 16, 0)))

Figure 1. Color-mapped 1:1 scale plot of circularly symmetric band-limiting filter impulse response, with cut-off frequency $\omega_c = \pi/8$ . Color key: blue: negative, white: zero, red: maximum.

Figure 2. Color-mapped 1:1 scale plots of sampled impulse responses of filters in the filter bank, with cut-off frequency $\omega_c = \pi/8$ , in order: $h_{0x}$ , $h_{0y}$ , $h_{1x}$ , $h_{1y}$ . Color key: blue: minimum, white: zero, red: maximum.

Directional edge detectors can be constructed as weighted sums of these. In Python (continued):

composite = kernel0x-4*kernel1x
plt.imsave('composite0.png', plt.cm.bwr(plt.Normalize(vmin=-composite.max(), vmax=composite.max())(composite)))
plt.imshow(composite, vmin=-np.max(composite), vmax=np.max(composite), cmap='bwr')
plt.colorbar()
plt.show()

composite = (kernel0x+kernel0y) + 4*(kernel1x+kernel1y)
plt.imsave('composite45.png', plt.cm.bwr(plt.Normalize(vmin=-composite.max(), vmax=composite.max())(composite)))
plt.imshow(composite, vmin=-np.max(composite), vmax=np.max(composite), cmap='bwr')
plt.colorbar()
plt.show()

Figure 3. Directional edge detection kernels constructed as weighted sums of kernels of Fig. 2. Color key: blue: minimum, white: zero, red: maximum.

The filters of Fig. 3 should be better tuned for continuous edges, compared to gradient filters (first two filters of Fig. 2).
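The 0° and 45° composites above are two instances of a general pattern: because the first pair of kernels varies as $\cos\varphi$, $\sin\varphi$ around the origin and the second pair as $\cos 3\varphi$, $\sin 3\varphi$, the composite can be steered to any angle by re-weighting the four kernels, without resampling. The sketch below is an illustration under assumptions: `steer` is a made-up name, and `stand_in_kernels` builds small Gaussian-windowed stand-ins with the same angular symmetry instead of the Bessel-based kernels of Fig. 2.

```python
import numpy as np

def steer(k0x, k0y, k1x, k1y, theta, weight=4.0):
    """Composite k0x - weight*k1x rotated by theta [rad], from the x axis toward the y axis."""
    first_order = np.cos(theta) * k0x + np.sin(theta) * k0y
    third_order = np.cos(3 * theta) * k1x - np.sin(3 * theta) * k1y
    return first_order - weight * third_order

def stand_in_kernels(N=41, sigma=5.0):
    """Hypothetical stand-ins: Gaussian-windowed polynomials, indexed [y, x] like the kernels above."""
    y, x = np.mgrid[0:N, 0:N] - (N - 1) / 2
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    k0x = -x * g
    k1x = -(x**3 - 3 * x * y**2) * g
    return k0x, k0x.T, k1x, k1x.T

kernels = stand_in_kernels()
composite0 = steer(*kernels, 0.0)         # equals kernel0x - 4*kernel1x
composite45 = steer(*kernels, np.pi / 4)  # the 45 degree composite above, divided by sqrt(2)
composite_any = steer(*kernels, np.deg2rad(32.19))
```

The factor $\sqrt{2}$ relative to the 45° composite above is only a scaling and does not change where the response peaks.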

Gaussian filters

The filters of Fig. 2 have a lot of oscillation due to strict band limiting. Perhaps a better starting point would be a Gaussian function, as in Gaussian derivative filters. They are, relatively speaking, much easier to handle mathematically. Let's try that instead. We start with the impulse response definition of a Gaussian "low-pass" filter:

$h(x, y, \sigma) = \frac{e^{-\displaystyle\frac{x^2 + y^2}{2 \sigma^2}}}{2\pi \sigma^2}.\tag{4}$

We apply the operators of Eq. 2 to $h(x, y, \sigma)$ and normalize each filter $h_{..}$ by:

$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}h_{..}(x, y, \sigma)^2\,dx\,dy = 1.\tag{5}$

$\begin{align}h_{0x}(x, y, \sigma) &= 2\sqrt{2\pi}σ^2 \frac{d}{dx}h(x, y, \sigma) = - \frac{\sqrt{2}}{\sqrt{\pi}σ^2} x e^{-\displaystyle\frac{x^2 + y^2}{2σ^2}},\\ h_{0y}(x, y, \sigma) &= h_{0x}(y, x, \sigma),\\ h_{1x}(x, y, \sigma) &= \frac{2\sqrt{3\pi}σ^4}{3}\left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y, \sigma) = - \frac{\sqrt{3}}{3\sqrt{\pi}σ^4} (x^3 - 3xy^2) e^{-\displaystyle\frac{x^2 + y^2}{2σ^2}},\\ h_{1y}(x, y, \sigma) &= h_{1x}(y, x, \sigma).\end{align}\tag{6}$
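The normalization of Eq. 5 can be checked numerically for the closed forms of Eq. 6 with a plain Riemann sum. This is a sketch for verification only; the value of `sigma`, the grid extent and the grid step are arbitrary choices.

```python
import numpy as np

def h0x(x, y, sigma):
    """Eq. 6: normalized first-order Gaussian derivative filter."""
    return -np.sqrt(2) / (np.sqrt(np.pi) * sigma**2) * x * np.exp(-(x**2 + y**2) / (2 * sigma**2))

def h1x(x, y, sigma):
    """Eq. 6: normalized third-order filter with six-fold sign pattern."""
    return (-np.sqrt(3) / (3 * np.sqrt(np.pi) * sigma**4)
            * (x**3 - 3 * x * y**2) * np.exp(-(x**2 + y**2) / (2 * sigma**2)))

sigma = 2.0
step = 0.1
axis = np.arange(-12 * sigma, 12 * sigma + step, step)
x, y = np.meshgrid(axis, axis)

# Riemann-sum approximations of the double integrals
energy_h0x = np.sum(h0x(x, y, sigma)**2) * step**2
energy_h1x = np.sum(h1x(x, y, sigma)**2) * step**2
overlap = np.sum(h0x(x, y, sigma) * h1x(x, y, sigma)) * step**2
```

Both energies come out as 1 and the overlap as 0 (to numerical precision), so the two filters are orthonormal at equal $\sigma$.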

We would like to construct from these, as their weighted sum, the impulse response of a vertical edge detector filter that maximizes the specificity $S$, defined as the mean sensitivity to a vertical edge over the possible edge shifts $s$, relative to the mean sensitivity over the possible edge rotation angles $\beta$ and possible edge shifts $s$:

$S = \frac{2\pi\displaystyle\int_{-\infty}^{\infty}\Bigg(\int_{-\infty}^{\infty}\bigg(\int_{-\infty}^{s}h_x(x, y, \sigma)dx - \int_{s}^{\infty}h_x(x, y, \sigma)dx\bigg)dy\Bigg)^2ds} {\Bigg(\displaystyle\int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\bigg(\int_{-\infty}^{\infty}\Big(\int_{-\infty}^{s}h_x\big(\cos(\beta)x- \sin(\beta)y, \sin(\beta)x + \cos(\beta)y\big)dx \\- \displaystyle\int_{s}^{\infty}h_x\big(\cos(\beta)x - \sin(\beta)y, \sin(\beta)x + \cos(\beta)y\big)dx\Big)dy\bigg)^2ds\,d\beta\Bigg)}.\tag{7}$
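As a sanity check of Eq. 7, the integrals can be done by hand for the simplest case, assuming $h_x$ is just the first-order filter $h_{0x}$ of Eq. 6. Writing $E(s)$ (a shorthand introduced only for this check) for the response to a vertical edge at shift $s$:

$\begin{align}E(s) &= \int_{-\infty}^{\infty}\bigg(\int_{-\infty}^{s}h_{0x}(x, y, \sigma)dx - \int_{s}^{\infty}h_{0x}(x, y, \sigma)dx\bigg)dy\\ &= \frac{2\sqrt{2}}{\sqrt{\pi}}e^{-\frac{s^2}{2\sigma^2}}\int_{-\infty}^{\infty}e^{-\frac{y^2}{2\sigma^2}}dy = 4\sigma e^{-\frac{s^2}{2\sigma^2}}.\end{align}$

Rotating the filter gives $h_{0x}\big(\cos(\beta)x - \sin(\beta)y, \sin(\beta)x + \cos(\beta)y, \sigma\big) = \cos(\beta)h_{0x}(x, y, \sigma) - \sin(\beta)h_{0y}(x, y, \sigma)$. The $h_{0y}$ term is odd in $y$ and integrates to zero, so the rotated response is $\cos(\beta)E(s)$ and

$S = \frac{2\pi\displaystyle\int_{-\infty}^{\infty}E(s)^2ds}{\displaystyle\int_{-\pi}^{\pi}\cos^2(\beta)d\beta\int_{-\infty}^{\infty}E(s)^2ds} = \frac{2\pi}{\pi} = 2.$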

We only need a weighted sum of $h_{0x}$ with variance $\sigma^2$ and $h_{1x}$ with optimal variance. It turns out that $S$ is maximized by an impulse response:

$\begin{align}h_x(x, y, \sigma) &= \frac{\sqrt{7625 - 2440\sqrt{5}}}{61} h_{0x}(x, y, \sigma) - \frac{2\sqrt{610\sqrt{5} - 976}}{61} h_{1x}(x, y, \sqrt{5}\sigma)\\ &= - \frac{\sqrt{15250 - 4880\sqrt{5}}}{61\sqrt{\pi}\sigma^2}xe^{-\displaystyle\frac{x^2 + y^2}{2\sigma^2}} + \frac{\sqrt{1830\sqrt{5} - 2928}}{4575 \sqrt{\pi} \sigma^4}(2x^3 - 6xy^2)e^{-\displaystyle\frac{x^2 + y^2}{10 \sigma^2}}\\ &= \frac{2\sqrt{\pi}\sigma^2\sqrt{15250 - 4880\sqrt{5}}}{61}\frac{d}{dx}h(x, y, \sigma) - \frac{100\sqrt{\pi}\sigma^4\sqrt{1830\sqrt{5} - 2928}}{183}\left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y,\sqrt{5}\sigma)\\ &\approx 3.8275359956049814\,\sigma^2\frac{d}{dx}h(x, y, \sigma) - 33.044650082417731\,\sigma^4\left(\left(\frac{d}{dx}\right)^3-3\frac{d}{dx}\left(\frac{d}{dy}\right)^2\right)h(x, y,\sqrt{5}\sigma),\end{align}\tag{8}$

also normalized by Eq. 5. To vertical edges, this filter has a specificity of $S = \frac{10\times5^{1/4}}{9} + 2 \approx 3.661498645$ , in contrast to the specificity $S = 2$ of a first-order Gaussian derivative filter with respect to $x$ . The last part of Eq. 8 has normalization compatible with separable 2-d Gaussian derivative filters from Python's scipy.ndimage.gaussian_filter:

import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage

sig = 8;
N = 161
x = np.zeros([N, N])
x[N//2, N//2] = 1
ddx = scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[0, 1], truncate=(N//2)/sig)
ddx3 = scipy.ndimage.gaussian_filter(x, sigma=[np.sqrt(5)*sig, np.sqrt(5)*sig], order=[0, 3], truncate=(N//2)/(np.sqrt(5)*sig))
ddxddy2 = scipy.ndimage.gaussian_filter(x, sigma=[np.sqrt(5)*sig, np.sqrt(5)*sig], order=[2, 1], truncate=(N//2)/(np.sqrt(5)*sig))

hx = 3.8275359956049814*sig**2*ddx - 33.044650082417731*sig**4*(ddx3 - 3*ddxddy2)
plt.imsave('hx.png', plt.cm.bwr(plt.Normalize(vmin=-hx.max(), vmax=hx.max())(hx)))

h = scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[0, 0], truncate=(N//2)/sig)
plt.imsave('h.png', plt.cm.bwr(plt.Normalize(vmin=-h.max(), vmax=h.max())(h)))
h1x = scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[0, 3], truncate=(N//2)/sig) - 3*scipy.ndimage.gaussian_filter(x, sigma=[sig, sig], order=[2, 1], truncate=(N//2)/sig)
plt.imsave('ddx.png', plt.cm.bwr(plt.Normalize(vmin=-ddx.max(), vmax=ddx.max())(ddx)))
plt.imsave('h1x.png', plt.cm.bwr(plt.Normalize(vmin=-h1x.max(), vmax=h1x.max())(h1x)))
plt.imsave('gaussiankey.png', plt.cm.bwr(np.repeat([(np.arange(161)/160)], 16, 0)))

Figure 4. Color-mapped 1:1 scale plots of, in order: A 2-d Gaussian function, derivative of the Gaussian function with respect to $x$ , a differential operator $\big(\frac{d}{dx}\big)^3-3\frac{d}{dx}\big(\frac{d}{dy}\big)^2$ applied to the Gaussian function, the optimal two-component Gaussian-derived vertical edge detection filter $h_x(x, y, \sigma)$ of Eq. 8. The standard deviation of each Gaussian was $\sigma = 8$ except for the hexagonal component in the last plot which had standard deviation $\sqrt{5}\times8$ . Color key: blue: minimum, white: zero, red: maximum.
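The numerical constants in Eq. 8 can be cross-checked with a few lines of arithmetic. This sketch only re-evaluates the closed-form expressions; the variable names are made up for illustration.

```python
import math

sqrt5 = math.sqrt(5)

# Weights of the two normalized components in the first line of Eq. 8
a = math.sqrt(7625 - 2440 * sqrt5) / 61     # weight of h_0x(x, y, sigma)
b = 2 * math.sqrt(610 * sqrt5 - 976) / 61   # weight of h_1x(x, y, sqrt(5)*sigma), entering with a minus sign

# Unit energy of the sum requires a^2 + b^2 = 1, since the components are orthonormal
weight_energy = a**2 + b**2

# Coefficients of the derivative form, using the prefactors of Eq. 6
coeff_first = a * 2 * math.sqrt(2 * math.pi)                  # multiplies sigma^2 * d/dx h
coeff_third = b * 2 * math.sqrt(3 * math.pi) / 3 * sqrt5**4   # multiplies sigma^4 * (third-order operator) h

specificity = 10 * 5**0.25 / 9 + 2
```

The factor `sqrt5**4` appears because the third-order component uses standard deviation $\sqrt{5}\sigma$, so the $\sigma^4$ in its normalization becomes $25\sigma^4$.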

TO BE CONTINUED...

— Olli Niemitalo