Bug #1744
Task #1726: Handling cases of bad header variants like "representation"
Bug #1743: Checking singular and plural forms of the tmp_var from variations_in_common_section_words in common words list
Calculating the fullness_ratio of the header variants to decide a threshold for removal of header variants
Start date: 10/13/2021
Due date:
% Done: 0%
Estimated time: 2.00 h
Description
Using the original headers, calculate the fullness_ratio of each header variant with respect to its original header, and check whether an underlying pattern emerges that lets us remove bad KPs.
E.g.
original header: at object creation time, using keyword arguments
tmp_var: keyword
fullness_ratio: 0.14285
Test this idea on the Lib Ref, C-API, Whirlwind, and Lang Ref books.
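
A minimal sketch of how the ratio could be computed. This ticket does not define fullness_ratio, so the code assumes it is the number of words in tmp_var divided by the number of words in the original header. That assumption matches the example above (1 word out of 7 is about 0.14285). The helper names `tokenize`, `fullness_ratio`, and `filter_variants`, and the `threshold` parameter, are hypothetical and not taken from the existing code base.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9_]+", text.lower())


def fullness_ratio(variant, original_header):
    """Assumed definition: word count of the variant / word count of the header."""
    header_words = tokenize(original_header)
    if not header_words:
        return 0.0  # avoid division by zero for empty headers
    return len(tokenize(variant)) / len(header_words)


def filter_variants(variants, original_header, threshold):
    """Keep only variants whose ratio is at least `threshold`.

    The threshold is the value this ticket is meant to determine,
    so no default is supplied here.
    """
    return [v for v in variants if fullness_ratio(v, original_header) >= threshold]


header = "at object creation time, using keyword arguments"
ratio = fullness_ratio("keyword", header)
print(round(ratio, 5))
```

Collecting these ratios for every (original header, tmp_var) pair in each book, together with a good/bad label for the variant, would show whether a single cut-off separates the two groups.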