Python Scipy Read in Data and Find Peaks
Finding (real) peaks in your signal with SciPy and some common-sense tips
If you are a data scientist, 1 of your typical task is to clarify a certain signal and find its peaks. This may exist your goal for a ton of reasons, just the bigger 1 is that peaks somehow tend to prove a sure property of your signal. It is not surprising that a library that helps you lot finding peaks does already exist in Python, and information technology is SciPy (with its find_peaks part). Nonetheless it is important to use this powerful library wisely to get exactly what we want and to do and then we need to brand our definition of "peak" more than explicit.
What are you looking for?
In the case I'one thousand going to evidence a superlative is a star :)
In the astrophysics domain, the Hubble telescope furnishes the image file information of the Tucanae star . It looks like this:
Mathematically speaking, it is just a matrix of N rows and M columns. Each number of the matrix represents the intensity of the pixel. As we would like to take a star to be as airtight as possible to a pixel we would like to:
A) If two peaks are as well shut to each other we may want to consider them as one single top
B) Non to have higher values of intensity besides close to the peak we are considering (i.e. if our peak is close to one thousand, we don't want to accept a 2000 peak close to this one)
C) Nosotros may want that peak to decay to a sure value (0 in this example) in both ends of the range nosotros are studying
As it is possible to encounter, at that place is no astrophysics in these condition, as they are common sense consideration nearly what it actually is a "peak". They may also sound trivial but it is of import to consider them in our code. Speaking of, let's start with the Python talk!
Import your wizards and name your elves.
In this specific instance, y'all may use this setup:
import astropy
from astropy.io import fits
import pandas every bit pd
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from matplotlib import cm
import numpy every bit np
import math
import seaborn every bit sns
from scipy.misc import electrocardiogram
from scipy.signal import find_peaks
hdu_list= fits.open('imagefilepath')
img_data=hdu_list[0].data
Employ find_peaks
Let's say nosotros are interested in detecting the 1000 meridian elevation. Find peaks is a powerful tool to practise that, just it does include the A, B and C evil peaks. Anyway it is a skilful start. Fix a sure Y on your image and discover your peaks with find_peaks. Generally you may desire to do that for all the Y-es.
HEIGHTS_fixed_y=[]
X_HEIGHTS=[]
Y=[]
for y in range(len(img_data)):
x = img_data[y]
peaks, _= find_peaks(x,height=(950, 1050))
if len(peaks)!=0:
for i in range(len(peaks)):
HEIGHTS_fixed_y.suspend(x[peaks].tolist()[i])
X_HEIGHTS.append(peaks.tolist()[i])
Y.append(y)
#X_HEIGHTS save all the x coordinates of the peaks
#HEIGHTS_FIXED_y save all the heights of the peaks (they will be between 950 and 1050)
#Y salve all the y coordinates of the peaks thousand=pd.DataFrame()
one thousand['Heights']=HEIGHTS_fixed_y
grand['$X_{heights}$']=X_HEIGHTS
m['Y']=Y
A) Deleting the peaks that are too close to each other
Let's say Y=400. You lot may have that at this coordinate, X=20 is a peak and 10=9800 is a peak likewise, and that is ok. But if X=20 is a peak and X=24 is a peak besides, then nosotros are non doing things right! Let'southward say that the Xes for a stock-still Y has to be separated past at to the lowest degree 10 pixels, and delete them if information technology is not the case.
y_unique_values=list(gear up(thousand.Y.tolist()))
wrong_y=[]
for y in y_unique_values:
wrong_x=thousand[thousand['Y']==y]['$X_{heights}$'].tolist()
differences=[]
if len(wrong_x)!=one: for x in range(len(wrong_x)-1):
first_value=wrong_x[x]
for j in range(ten+1,len(wrong_x)):
divergence=abs(first_value-wrong_x[j])
differences.append(deviation)
for d in differences:
if d<=10:
wrong_y.suspend(y)
for y in wrong_y:
thousand=k[yard.Y!=y]
k=chiliad.reset_index().drop(columns=['alphabetize'])
B) It is not a meridian if in that location is something higher
If yous desire to tiptop elevation to be shut to 1000, you tin't take that peaks with top shut to 2000 are too close to the x of your peak. Get rid of them with this code:
not_peak_index=[]
for i in range(len(thousand)):
y_values=img_data[g.Y.tolist()[i]].tolist()
x_min,x_max=thousand['$X_{heights}$'].tolist()[i]-10,thousand['$X_{heights}$'].tolist()[i]+10
range_values=y_values[x_min:x_max]
binary=0
for value in range_values:
if value > thousand['Heights'].tolist()[i]:
binary=binary+ane
if binary!=0:
not_peak_index.suspend(i)
C) Ok meridian, you had your moment, now get back to sleep.
You may want that your peak decays to 0 at both ends of your range (i.e. a situation like the one below is not acceptable)
Discover the peaks that actually go to 0 with this code:
not_zeros_index=[]
for i in range(len(thousand)):
y_values=img_data[grand.Y.tolist()[i]].tolist()
x_min,x_max=chiliad['$X_{heights}$'].tolist()[i]-10,thousand['$X_{heights}$'].tolist()[i]+10
if abs(y_values[x_min])>10 and abs(y_values[x_max]>ten):
not_zeros_index.suspend(i)
Nosotros set 10 because it is an acceptable compromise not to delete a larger gear up of points, but of course you are the dominate!
Conclusions:
SciPy (and Python libraries in general) is a really powerful library, simply it needs to be used wisely as it tin can not know what we want if we don't specify it. In this article 3 tips are given in lodge to clean your find_peaks resulting dataframe from unwanted peaks simply of form, you may want to develop some more conditions on your urgencies :)
If you liked the article and y'all desire to know more about Machine Learning, or you simply desire to inquire me something you tin:
A. Follow me on Linkedin, where I publish all my stories
B. Subscribe to my newsletter. It will go on yous updated well-nigh new stories and requite yous the chance to text me to receive all the corrections or doubts you may take.
C. Become a referred member, then yous won't accept any "maximum number of stories for the month" and you can read whatever I (and thousands of other Automobile Learning and Information Scientific discipline top writer) write about the newest technology available.
Ciao :)
Source: https://medium.com/analytics-vidhya/finding-real-peaks-in-your-signal-with-scipy-and-some-common-sense-tips-a62c55cd387d
Post a Comment for "Python Scipy Read in Data and Find Peaks"