knarrow package¶
Subpackages¶
Submodules¶
knarrow.angle_method module¶
- knarrow.angle_method.angle(x, y, **kwargs)¶
Find a knee by looking at the maximum change of the angle of the line going through consecutive point pairs.
Quite sensitive to noise, use with cubic spline smoothing.
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs – possible additional arguments (none are used)
- Returns:
the index of the knee
- Return type:
int
knarrow.c_method module¶
- knarrow.c_method.c_method(x, y, **kwargs)¶
Implements the C-method as described in https://blog.msmetko.xyz/posts/2
- Parameters:
x (
np.ndarray) – the ground truth \(x\) coordinatesy (
np.ndarray) – the ground truth \(y\) coordinates**kwargs –
dict, optional arguments (should be empty)
- Returns:
the index of the knee
- Return type:
int
- knarrow.c_method.d2e_dc2(y, x, c)¶
The second derivative of the energy function \(E(x)\) with respect to \(c\).
Used in
newton_raphsonmethod for the optimization of the shape parameter \(c\).- Parameters:
y (
np.ndarray) – the ground truth function values we wish to fit the knee curve \(f(x)\) onx (
np.ndarray) – the \(x\) coordinates of the pointsc (
float) – the shape parameter
- Returns:
values of the second derivative of \(E(x)\) evaluated at \(x\)
- Return type:
np.ndarray
- knarrow.c_method.d2f_dc2(x, c)¶
The second derivative of the knee curve \(f(x)\) with respect to \(x\), conditioned on \(c\).
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsc (
float) – the shape parameter
- Returns:
values of the second derivative of \(f(x)\) evaluated at \(x\)
- Return type:
np.ndarray
- knarrow.c_method.de_dc(y, x, c)¶
The first derivative of the energy function \(E(x)\) with respect to \(c\).
Used in
newton_raphsonmethod for the optimization of the shape parameter \(c\).- Parameters:
y (
np.ndarray) – the ground truth function values we wish to fit the knee curve \(f(x)\) onx (
np.ndarray) – the \(x\) coordinates of the pointsc (
float) – the shape parameter
- Returns:
values of the first derivative of \(E(x)\) evaluated at \(x\)
- Return type:
np.ndarray
- knarrow.c_method.df_dc(x, c)¶
The first derivative of the knee curve \(f(x)\) with respect to \(x\), conditioned on \(c\).
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsc (
float) – the shape parameter
- Returns:
values of the slopes of tangents on \(f(x)\) evaluated at \(x\)
- Return type:
np.ndarray
- knarrow.c_method.f(x, c)¶
The knee curve.
This function is being fitted to the input data by tweaking the c parameter.
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsc (
float) – the shape parameter
- Returns:
the values of the function evaluated at \(x\) with respect to. c
- Return type:
np.ndarray
- knarrow.c_method.get_knee(c)¶
Calculates the position of the point of maximal curvature.
The position depends solely on the shape parameter \(c\).
- Parameters:
c (
float) – the shape parameter- Returns:
the \(x\) position where the knee curve \(f(x)\) is most curved
- Return type:
float
- knarrow.c_method.newton_raphson(x, y)¶
The implementation of the Newton-Raphson optimization procedure. It fits the knee curve \(f(x)\) to the \(y\) s of the corresponding \(x\) s by tweaking the shape parameter \(c\) from an initial guess.
- Parameters:
x (
np.ndarray) – the ground truth \(x\) coordinatesy (
np.ndarray) – the ground truth \(y\) coordinates
- Returns:
the optimal shape parameter \(c\) which minimizes the squared error, up to a predefined tolerance level
- Return type:
float
knarrow.distance_method module¶
- knarrow.distance_method.distance(x, y, **kwargs)¶
Find a knee by finding a point which is most distant from the line \(y=x\) (after normalizing the inputs)
Fun fact: vertical distance of a point P from line \(y=x\) is just a scaled version of the orthogonal distance of the same point P from line \(y=x\).
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs – possible additional arguments (none are actually used)
- Returns:
the index of the knee
- Return type:
int
- knarrow.distance_method.distance_adjacent(x, y, **kwargs)¶
Find a knee by finding a point which is most distant from the line going through the neighbouring points.
This method is quite sensitive to noise, so only use with cubic spline smoothing.
Note: I developed a (somewhat) fancy linear algebra implementation so this should be quite fast.
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs – possible additional arguments (none are actually used)
- Returns:
the index of the knee
- Return type:
int
knarrow.kneedle module¶
- knarrow.kneedle.kneedle(x, y, **kwargs)¶
Kneedle method from https://doi.org/10.1109/ICDCSW.2011.20
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs
- Returns:
the index of the knee
- Return type:
int
knarrow.main module¶
knarrow.menger module¶
- knarrow.menger.double_triangle_area(vertices)¶
Return twice the area of a triangle with the given vertices.
- Parameters:
vertices (
np.ndarray) – \(3 \times 2\) matrix where every row is a new 2-D vertex \((x, y)\)- Returns:
\(1 \times 1\) array, twice the area
- Return type:
np.ndarray
- knarrow.menger.get_curvature(vertices)¶
Calculate the Menger curvature defined by the three points
- Parameters:
vertices (
np.ndarray) – \(3 \times 2\) matrix where every row is a new 2-D vertex \((x, y)\)- Returns:
the curvature, i.e. the reciprocal of the radius of the circumcircle around the vertices
- Return type:
np.ndarray
- knarrow.menger.get_squared_vector_lengths(vertices)¶
Return the square of the lengths between neighbouring vertices
- Parameters:
vertices (
np.ndarray) – array of vertices- Returns:
lenghts; the entry at
lenghts[i]is a squared distance between the vertexiandi+1- Return type:
np.ndarray
- knarrow.menger.menger_anchored(x, y, **kwargs)¶
Find a knee using the Menger curvature on the first point, last point, and varying the middle point. More resistant to the noise in the data than menger_successive
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs – possible additional arguments (none are used)
Returns –
int: the index of the knee
- knarrow.menger.menger_successive(x, y, **kwargs)¶
Find a knee using the Menger curvature on the three successive points
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs – possible additional arguments (none are used)
- Returns:
the index of the knee
- Return type:
int
knarrow.ols module¶
- knarrow.ols.ols_swiping(x, y, **kwargs)¶
Performs OLS swiping method.
By first selecting a pivot point, all the data points are implicitly split into two parts: the left one and the right one. Then, the algorithm regresses a line onto \(y\) with respect to \(x\) for both parts using ordinary least squares. If both lines fit quite well, meaning the \(R^2\) for both left and the right fit is particularly high, the pivot point is declared as a knee.
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the points**kwargs – possible additional arguments (none are used)
- Returns:
the index of the knee
- Return type:
int
- knarrow.ols.r_squared(x, y)¶
Fit a linear regression to \(y\) and return the \(R^2\) statistical measure of a fit of a linear regression
- Parameters:
x (
np.ndarray) – The \(x\) coordinates of the pointsy (
np.ndarray) – The \(y\) coordinates of the points
- Returns:
The \(R^2\) measure of a fit. Bounded by \(\left[0, 1\right]\). The closer \(R^2\) is to \(1\), the better the fit.
- Return type:
float
knarrow.util module¶
- class knarrow.util.KneeType(*values)¶
Bases:
Enum- DECREASING_CONCAVE = 2¶
- DECREASING_CONVEX = 0¶
- INCREASING_CONCAVE = 1¶
- INCREASING_CONVEX = 3¶
- knarrow.util.cubic_spline_smoothing(x, y, smoothing_factor=0)¶
Smoothes the \(y\) vecetor by minimizing the second derivative.
- Parameters:
x (
np.ndarray) – the \(x\) coordinates of the pointsy (
np.ndarray) – the \(y\) coordinates of the pointssmoothing_factor (
float) – the cubic spline smoothing hyperparameter
- Returns:
the \(x\) and \(y\) coordinates of the smoothed points
- Return type:
tupleofnp.ndarray
- knarrow.util.detect_knee_type(y1, y2, y3, y4)¶
Detects the type of a knee using the first two and the last two points.
Simplifies the problem by assuming noiseless input.
- Parameters:
y1 (
float) – the \(y\) coordinate of the first pointy2 – (
float): the \(y\) coordinate of the second pointy3 – (
float): the \(y\) coordinate of the second-to-last pointy4 – (
float): the \(y\) coordinate of the last point
- Returns:
the type of the knee detected
- Return type:
- knarrow.util.get_delta_matrix(h)¶
Creates \(\Delta\) matrix. Used for cubic spline smoothing.
Adapted from https://en.wikipedia.org/wiki/Smoothing_spline#Derivation_of_the_cubic_smoothing_spline
[h1, h2, h3, h4] -> [[1/h1 -1/h1-1/h2 1/h2 0 0] [ 0 1/h2 -1/h2-1/h3 0 0] [ 0 0 1/h3 -1/h3-1/h4 1/h4]]
- Parameters:
h (
np.ndarray) – the differences vector- Returns:
the \(\Delta\) matrix
- Return type:
np.ndarray
- knarrow.util.get_weight_matrix(h)¶
Creates the weight matrix. Used for cubic spline smoothing.
Adapted from https://en.wikipedia.org/wiki/Smoothing_spline#Derivation_of_the_cubic_smoothing_spline
- Parameters:
h (
np.ndarray) – the differences vector- Returns:
the weight matrix \(W\)
- Return type:
np.ndarray
- knarrow.util.normalize(x)¶
Helper function for normalizing the inputs.
Normalization is an affine transformation such that the minimal element of x maps to 0, and maximal element of x maps to 1
- Parameters:
x (
np.ndarray) – the array to be normalized- Returns:
a normalized array such that the minimum is 0 and the maximum is 1
- Return type:
np.ndarray
- knarrow.util.np_anchored(length: int) ndarray[tuple[Any, ...], dtype[int64]]¶
Return indices x such that every row in array[x] is a slice of the array with a first, i-th and the last element Helper function which uses numpy broadcasting to work
- Parameters:
length (
int) – the total length of the array to be indexed in the anchored way- Returns:
indices for windowing an array
- Return type:
np.ndarray
- knarrow.util.np_windowed(length: int, window_size: int, stride: int = 1, dilation: int = 1) ndarray¶
Return indices x such that every row in array[x] is a windowed slice of the array Helper function which uses numpy broadcasting to work Adapted from: https://stackoverflow.com/a/42258242
- Parameters:
length (
int) – the total length of the array to be indexedwindow_size (
int) – the size of the window to be slided across the arraystride (
int) – the size of the “jumps” between the consecutive windows. Defaults to 1dilation (
int) – the distance between the elements in the window. Defaults to 1
- Returns:
indices for windowing an array
- Return type:
np.ndarray
- knarrow.util.prepare(f)¶
- knarrow.util.projection_distance(vertices)¶
Return the projection distance of the point P1 to the line through both P2 and the origin.
- Parameters:
vertices (
np.ndarray) – array of shape(..., 2, 2). Coordinates of the points such thatvertices[..., 0, :]are the coordinates of the point we wish to project on a line defined with the origin andvertices[..., 1, :]- Returns:
one dimensional array denoting the distance the point
vertices[..., 0, :]must travel to be projected onto the line defined byvertices[..., 1, :]and the origin- Return type:
np.ndarray
Module contents¶
- knarrow.all(*args, **kwargs)¶
- knarrow.find_knee(*args, **kwargs)¶