Bug #1744

Task #1726: Handling cases of bad header variants like "representation"

Bug #1743: Checking singular and plural forms of the tmp_var from variations_in_common_section_words in common words list

Calculating the fullness_ratio of the header variants to decide a threshold for removal of header variants

Added by Nandini Bansal about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version:
Start date: 10/13/2021
Due date:
% Done: 0%
Estimated time: 2.00 h

Description

Using the original headers, calculate the fullness_ratio of each header variant with respect to its original header, to see whether an underlying pattern emerges that can be used to remove bad KPs.
E.g.
original header: at object creation time, using keyword arguments
tmp_var: keyword
fullness_ratio is 0.14285
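
A minimal sketch of the calculation, assuming fullness_ratio is the number of words in the variant divided by the number of words in the original header. Under that assumption the example above gives 1/7, which matches the reported 0.14285. The function names, the tokenization rule, and the threshold parameter are hypothetical and not taken from the project code.

```python
import re


def _words(text):
    # Assumed tokenization: lowercase alphanumeric runs, punctuation dropped.
    return re.findall(r"[a-z0-9_]+", text.lower())


def fullness_ratio(variant, original_header):
    """Share of the original header's words that the variant accounts for."""
    header_words = _words(original_header)
    if not header_words:
        return 0.0  # avoid division by zero on an empty header
    return len(_words(variant)) / len(header_words)


def keep_variants(variants, original_header, threshold):
    """Keep variants whose ratio is at or above a (hypothetical) threshold."""
    return [v for v in variants
            if fullness_ratio(v, original_header) >= threshold]


header = "at object creation time, using keyword arguments"
print(round(fullness_ratio("keyword", header), 5))  # 0.14286
```

The threshold is left as a parameter on purpose: the ticket is about finding a suitable value from the data, so none is assumed here.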

Test this idea on the Lib Ref, C-API, Whirlwind, and Lang Ref books.
