Task #1898
Updated by Nandini Bansal about 3 years ago
Repeating KPs within 50 words should be skipped as per the new scheme. This applies when multiple instances of the same KP, say k1, k2, k3 (in that order), are tagged within the threshold (50 words).

Case 5) If sk1 = sk2 = sk3, all of k1, k2, k3 will be tagged only if they are originally different words that are getting tagged because their lemmatized forms are the same. That is, if the true/original forms of k1, k2, k3 are the same, we shall not tag all three but k1 only.

A hypothetical example: say k1 = "allocators", k2 = "allocation", k3 = "allocations" and sk1 = sk2 = sk3. Then k1 & k2 will be tagged, since k1 and (k2, k3) are originally different but are getting tagged due to the same lemmatized form.

Note: skn = sim_score of kn

Changes are to be done in the "check_repetition_multi_occurence" function.
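
A rough sketch of the Case 5 rule, to make the expected behaviour concrete. This is not the existing "check_repetition_multi_occurence" code. The names pick_equal_score_kps and naive_original_form are hypothetical, and the ticket does not define how the true/original form is computed. The sketch assumes a naive normalizer (lowercase, drop one trailing "s") purely so that "allocation" and "allocations" compare equal, as in the example above.

```python
from typing import Callable, List, Sequence, Tuple


def naive_original_form(surface: str) -> str:
    """Hypothetical stand-in for the 'true/original form' of a KP.

    Assumption: lowercase the word and drop one trailing 's'.
    The real comparison used by the project may differ.
    """
    word = surface.lower()
    return word[:-1] if word.endswith("s") else word


def pick_equal_score_kps(
    occurrences: Sequence[Tuple[str, float]],
    original_form: Callable[[str], str] = naive_original_form,
) -> List[int]:
    """Return the indices of the occurrences that should be tagged.

    Assumes the occurrences share one lemmatized form, lie within the
    50-word threshold, are given in text order, and have equal sim_score.
    """
    scores = {score for _, score in occurrences}
    if len(scores) > 1:
        raise ValueError("this sketch only covers the equal-score case (Case 5)")

    seen = set()
    keep = []
    for idx, (surface, _score) in enumerate(occurrences):
        key = original_form(surface)
        # Tag only the first occurrence of each distinct original form.
        if key not in seen:
            seen.add(key)
            keep.append(idx)
    return keep


example = [("allocators", 0.8), ("allocation", 0.8), ("allocations", 0.8)]
print(pick_equal_score_kps(example))  # indices of k1 and k2
```

Passing the normalizer in as a parameter keeps the open question (what counts as the same original form) separate from the skip logic, so the rule can be checked against the example before it is wired into the real function.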