Feature #1587
Discard or reduce the scores of KPs with an apostrophe when the header variant does not contain one
Start date:
09/01/2021
Due date:
% Done:
100%
Estimated time:
2.50 h
Description
For this task, we need to change mapping_phrase_docid so that it identifies KPs that contain an apostrophe when the header variant does not. Such KPs generally turn out to be bad KP cases, e.g.:
1. item 's option -> item options
2. file 's format -> file formats
3. distribution 's version -> distribution versions
4. module 's api -> module api
5. module 's attributes -> module attributes
6. module 's contents -> module contents
7. option 's action -> option actions
These KPs are generally tagged out of context, so they should be penalized. A penalty of 0.1 can be applied to ensure that they score on the lower side of the sim_score band.
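A minimal sketch of the check and penalty described above, in Python. The names has_apostrophe and penalized_score are hypothetical and not taken from mapping_phrase_docid. The sketch also assumes that the penalty is subtracted from sim_score and that the result is floored at 0; the ticket does not specify either.

```python
APOSTROPHE_PENALTY = 0.1  # value suggested in this ticket


def has_apostrophe(text: str) -> bool:
    """Return True if text contains a straight or typographic apostrophe."""
    return "'" in text or "\u2019" in text


def penalized_score(kp: str, header_variant: str, sim_score: float) -> float:
    """Lower sim_score when the KP has an apostrophe and the header variant does not."""
    if has_apostrophe(kp) and not has_apostrophe(header_variant):
        # Assumption: the penalty is subtracted and the score never drops below 0.
        return max(0.0, sim_score - APOSTROPHE_PENALTY)
    return sim_score
```

With these assumptions, the pair "item 's option" / "item options" would be penalized, while a KP whose header variant also carries the apostrophe would keep its original score.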
Testing to be done with Library Reference.txt & Whirlwind.txt