ios - Swift Trie levenshtein distance search -
i've built trie data structure looks this:
struct trie<element : hashable> : equatable { private var children: [element: trie<element>] private var endhere: bool }
to perform autocorrection operations on input uitextfield
. gave trie variety of functions such insert:
/** private insert function. inserts elements trie using sequences' generator. - parameter g: `generatortype`. */ private mutating func insert<g: generatortype g.element == element>(g: g) { var gen = g if let head = gen.next() { if case nil = children[head]?.insert(gen) { children[head] = trie(g: gen) } } else { endhere = true } } /** insert elements trie. - parameter seq: sequence of elements. */ mutating func insert<s: sequencetype s.generator.element == element>(seq: s) { insert(seq.generate()) }
the necessary initializers:
/** create empty trie. */ init() { children = [:] endhere = false } /** initialize trie generator. - parameter g: `generatortype`. */ private init<g: generatortype g.element == element>(g: g) { var gen = g if let head = gen.next() { (children, endhere) = ([head:trie(g: gen)], false) } else { (children, endhere) = ([:], true) } } /** construct arbitrary sequence of sequences elements of type `element`. - parameter s: sequence of sequences. */ init<s: sequencetype, inner: sequencetype s.generator.element == inner, inner.generator.element == element>(_ s: s) { self.init() s.foreach { insert($0) } } /** construct trie sequence of elements. - parameter s: sequence. */ init <s: sequencetype s.generator.element == element>(_ s: s) { self.init(g: s.generate()) }
and conformed trie
sequencetype
can iterate through elements.
now, want implement levenshtein distance search search function like:
func search<s: sequencetype s.generator.element == element(s: s, maxdistance: int = 0) -> [(s, int)] { }
where return value list of matched subsequences found , max distance away original query sequence knowledge bit lacking. i'm not sure how perform search on trie , build list of matched sequences while calculating insertion, deletion, , replacement cost.
the solution nontrivial, take @ paper, fast string correction levenshtein-automata. treat trie dictionary automaton, intersected levenshtein automaton. search strategy used follow paths along intersection lead terms levenshtein distances (from query term) no greater specified threshold.
as reference, liblevenshtein has implementation in java. logic pertaining searching trie, in src/main/java/com/github/liblevenshtein/transducer.