edit_Distance

class edit_Distance

This class suggests similar words, or the nearest similar word, using an edit distance algorithm.

Public Functions

QVector<QString> editDistance(QString, QString)

This function takes two strings as arguments and calculates the edit distance between them, i.e. the minimum number of operations required to convert the first string into the second. It then returns the converted string. It also uses a heuristic to limit the search.

Parameters:
  • a

  • b

Returns:

editDistance between two strings

int min(int, int)

This function compares a and b and returns the smaller one.

Parameters:
  • a

  • b

Returns:

Minimum of a and b

QVector<QString> phrase_heuristics(QStringList, QStringList)

This function efficiently retrieves the vocabulary terms likely to have a low edit distance to the query items by restricting the search, and then returns the optimal path to convert the first string into the second.

Parameters:
  • s1

  • s2

Returns:

optimalPath
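
The heuristic used by phrase_heuristics is not described here. As one illustration of how a search can be restricted, the sketch below uses a common pre-filter: the edit distance between two strings is never smaller than the difference in their lengths, so terms whose length differs too much from the query can be skipped without computing the distance. The name lengthFilterSketch and the maxDistance parameter are hypothetical, and plain std::string is used instead of the Qt types so the example stands alone.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pre-filter (not the class's actual heuristic): keeps only the
// vocabulary terms whose length is within maxDistance of the query's length.
// Any term dropped here has an edit distance greater than maxDistance.
std::vector<std::string> lengthFilterSketch(const std::string& query,
                                            const std::vector<std::string>& vocabulary,
                                            std::size_t maxDistance) {
    std::vector<std::string> candidates;
    for (const std::string& term : vocabulary) {
        const std::size_t longer = std::max(term.size(), query.size());
        const std::size_t shorter = std::min(term.size(), query.size());
        if (longer - shorter <= maxDistance) {
            candidates.push_back(term);
        }
    }
    return candidates;
}
```

Only the surviving candidates would then need a full edit distance computation.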

void backtrace(QStringList, QStringList, int**)

This function helps the edit distance algorithm by pointing to the previous cell that was used to calculate the cost of converting the first string into the second.

Parameters:
  • s1

  • s2

  • solution
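
A minimal sketch of the backtracing idea, assuming a standard unit-cost edit distance table over word tokens. The names Tokens, Table, buildTableSketch and backtraceSketch are hypothetical, std::vector replaces QStringList and int** so the example stands alone, and the sketch returns the list of operations rather than void.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;
using Table = std::vector<std::vector<int>>;

// Hypothetical helper: T[i][j] is the cost of turning the first i tokens of
// s1 into the first j tokens of s2 (insert, delete, replace each cost 1).
Table buildTableSketch(const Tokens& s1, const Tokens& s2) {
    Table T(s1.size() + 1, std::vector<int>(s2.size() + 1, 0));
    for (std::size_t i = 0; i <= s1.size(); ++i) T[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= s2.size(); ++j) T[0][j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= s1.size(); ++i) {
        for (std::size_t j = 1; j <= s2.size(); ++j) {
            const int replaceCost = T[i - 1][j - 1] + (s1[i - 1] == s2[j - 1] ? 0 : 1);
            T[i][j] = std::min({replaceCost, T[i - 1][j] + 1, T[i][j - 1] + 1});
        }
    }
    return T;
}

// Hypothetical backtrace: starts at the bottom-right cell and repeatedly
// steps to the previous cell that produced the current cost, recording the
// operation taken. The operations are returned in left-to-right order.
std::vector<std::string> backtraceSketch(const Tokens& s1, const Tokens& s2, const Table& T) {
    std::vector<std::string> ops;
    std::size_t i = s1.size();
    std::size_t j = s2.size();
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 && s1[i - 1] == s2[j - 1] && T[i][j] == T[i - 1][j - 1]) {
            ops.push_back("keep " + s1[i - 1]);
            --i; --j;
        } else if (i > 0 && j > 0 && T[i][j] == T[i - 1][j - 1] + 1) {
            ops.push_back("replace " + s1[i - 1] + " with " + s2[j - 1]);
            --i; --j;
        } else if (i > 0 && T[i][j] == T[i - 1][j] + 1) {
            ops.push_back("delete " + s1[i - 1]);
            --i;
        } else {
            ops.push_back("insert " + s2[j - 1]);
            --j;
        }
    }
    std::reverse(ops.begin(), ops.end());
    return ops;
}
```

For the phrases "the quick fox" and "the fast fox", split into words, this walk yields three steps: keep "the", replace "quick" with "fast", keep "fox".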

int getEditDistance(std::string first, std::string second)

This function takes two strings as arguments and calculates the edit distance between them, i.e. the minimum number of operations required to convert the first string into the second.

Parameters:
  • first

  • second

Returns:

T[m][n]
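
The return value T[m][n] suggests a dynamic programming table. A minimal sketch of that approach is shown below, assuming the classic Levenshtein formulation in which insertion, deletion and substitution each cost one operation. The actual costs used by getEditDistance are not stated here, and the name editDistanceSketch is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for getEditDistance, assuming unit costs.
int editDistanceSketch(const std::string& first, const std::string& second) {
    const std::size_t m = first.size();
    const std::size_t n = second.size();

    // T[i][j] = operations needed to turn first[0..i) into second[0..j).
    std::vector<std::vector<int>> T(m + 1, std::vector<int>(n + 1, 0));

    // Converting to or from an empty prefix takes one operation per character.
    for (std::size_t i = 0; i <= m; ++i) T[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= n; ++j) T[0][j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            const int substitution = T[i - 1][j - 1] + (first[i - 1] == second[j - 1] ? 0 : 1);
            const int deletion = T[i - 1][j] + 1;
            const int insertion = T[i][j - 1] + 1;
            T[i][j] = std::min({substitution, deletion, insertion});
        }
    }
    return T[m][n];
}
```

Under these assumptions, converting "kitten" into "sitting" takes 3 operations: two substitutions and one insertion.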

double findStringSimilarity(std::string first, std::string second)

This function takes two strings as arguments and calculates the similarity between them.

Parameters:
  • first

  • second

Returns:

double

int getSimilarityValue(std::string str1, std::string str2)

Implementation of the Ratcliff/Obershelp pattern-matching algorithm.

It returns the similarity index of two strings, i.e. how similar or dissimilar they are. It is a sequence-based algorithm. See https://itnext.io/string-similarity-the-basic-know-your-algorithms-guide-3de3d7346227

Parameters:
  • str1

  • str2

Returns:

Similarity index of two strings
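
A minimal sketch of the Ratcliff/Obershelp idea in its textbook form: find the longest common substring, recurse on the pieces to its left and right, and score the result as twice the number of matched characters divided by the combined length. The names matchedCharsSketch and ratcliffObershelpSketch are hypothetical, and the sketch returns a double between 0 and 1, whereas getSimilarityValue returns an int whose scale is not stated here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts matched characters by taking the longest
// common substring and recursing on what lies to its left and right.
std::size_t matchedCharsSketch(const std::string& a, const std::string& b) {
    if (a.empty() || b.empty()) return 0;

    std::size_t best = 0, endA = 0, endB = 0;
    // prev[j] and curr[j] hold the length of the common suffix of
    // a[0..i-1) / a[0..i) and b[0..j).
    std::vector<std::size_t> prev(b.size() + 1, 0), curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            curr[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
            if (curr[j] > best) {
                best = curr[j];
                endA = i;
                endB = j;
            }
        }
        prev.swap(curr);
    }
    if (best == 0) return 0;

    const std::size_t startA = endA - best;
    const std::size_t startB = endB - best;
    return best
         + matchedCharsSketch(a.substr(0, startA), b.substr(0, startB))
         + matchedCharsSketch(a.substr(endA), b.substr(endB));
}

// Hypothetical stand-in for the similarity index: 2 * matches / total length.
double ratcliffObershelpSketch(const std::string& a, const std::string& b) {
    if (a.empty() && b.empty()) return 1.0;
    return 2.0 * matchedCharsSketch(a, b) / (a.size() + b.size());
}
```

For "WIKIMEDIA" and "WIKIMANIA" the matched pieces are "WIKIM" and "IA", giving 7 matched characters out of 18 in total, so the score is 14/18.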

int matchPattern(std::string str1, int arLengthLeft, std::string str2, int arLengthRight)

Returns the match result between two strings.

Parameters:
  • str1

  • arLengthLeft

  • str2

  • arLengthRight

Returns:

Match result

double DiceMatch(std::string string1, std::string string2)

Implementation of the Sorensen-Dice algorithm.

It calculates the similarity between two strings. It is a token-based algorithm.

Parameters:
  • string1

  • string2

Returns:

Dice match result between two strings
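
A minimal sketch of the Sorensen-Dice coefficient, assuming the tokens are overlapping character pairs (bigrams). How DiceMatch actually tokenizes its input is not stated here, and the names bigramCountsSketch and diceSketch are hypothetical. The score is twice the number of shared bigrams divided by the total number of bigrams in both strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: counts each overlapping two-character token in s.
std::map<std::string, int> bigramCountsSketch(const std::string& s) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 1 < s.size(); ++i) {
        ++counts[s.substr(i, 2)];
    }
    return counts;
}

// Hypothetical stand-in for DiceMatch, assuming bigram tokens.
double diceSketch(const std::string& a, const std::string& b) {
    // Strings shorter than two characters have no bigrams; fall back to equality.
    if (a.size() < 2 || b.size() < 2) return a == b ? 1.0 : 0.0;

    const std::map<std::string, int> countsA = bigramCountsSketch(a);
    const std::map<std::string, int> countsB = bigramCountsSketch(b);

    // Shared bigrams, counting repeats only as often as both strings have them.
    int shared = 0;
    for (const auto& entry : countsA) {
        const auto it = countsB.find(entry.first);
        if (it != countsB.end()) {
            shared += std::min(entry.second, it->second);
        }
    }

    const std::size_t total = (a.size() - 1) + (b.size() - 1);
    return 2.0 * shared / total;
}
```

Under this assumption, "night" and "nacht" share only the bigram "ht" out of 8 bigrams in total, so the score is 0.25.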