P1 (open)
34%: 151 issues (42 closed, 109 open)

Related issues

Feature #1557: Ensure some documents do not show up in purple link results
Feature #1587: Discard and reduce the scores of KPs with apostrophe when the header variant does not contain it
Feature #1593: Eliminate certain header variants generated from variations_in_common_section_words
Feature #1598: Remove return datatype from headers with empty parenthesis
Feature #1615: Add the str_between of variation_middle_parenthesis to processed_full_header list
Feature #1708: For KPs located very closely, pick the one which is most similar & add a wrapper for lemmatisation to handle some exception cases
Feature #1766: Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
Feature #1805: Remove KPs starting with numbers in words
Feature #1807: Add a functionality in the ADR to log the KPs where sim_scores are changing
Feature #1809: Checking the context of the KP matched with "word1.word2" header variant
Feature #1810: Implement the penalty algorithm for KPs matching with "word1.word2" header variants in update_similarity_with_context function
Feature #1811: For matching "word1" with the KP, make use of token_processing function
Feature #1812: Analysis for improving the algorithms
Feature #1813: Analysis for improving the algorithms
Feature #1815: Implement a new context algorithm for the KPs matching with <word1.word2> subsection headers
Feature #1817: Update the similar document score string for KPs matching with different header variants
Feature #1856: Penalise the multi-word KP based on "Past" tense & vector similarity score
Feature #1892: In "set_subtract_const_h" method, add penalty for headers with all words within CW 500 unstemmed
Feature #1952: Ignore some KPs in the entire book based on the book name
Feature #2026: Can 'function' be a darkish purple link to 'Using and Defining Functions'?
Feature #2029: Do not display purple link results that are of much lower quality than some others
Feature #2129: Task 1: Creating KBs
Feature #2130: Task 1.1.1: Creating KBs - Step 1 & 2
Feature #2131: Task 1.1.2: Creating KBs - Step 3 - Save the KBs
Feature #2132: Task 1.1: Tag KPs using only 1 KB only
Feature #2133: Task 1.2. Tagging of KPs using multiple KBs
Feature #2134: Task 1.1.1.1: Add argument & create new calling function
Feature #2135: Task 1.1.1.2: Modify the tag_BR3_IR3 function for new scheme
Feature #2136: Task 1.1.2.1. Linking google drive and creating folder inside it using python script
Feature #2137: Task 1.1.2.2. Merge the lists to save pickle file
Feature #2138: Task 1.1.2.3. Save remaining variables in pickle/json file format
Feature #2234: Pick right parts of "KP sentence" and sentences before and after
Feature #2338: KPs like iterator and iterable need to be considered different
Feature #2356: Extract important substrings using exclude_headers_list
Feature #2585: Penalty to be proportional to the number of nouns in the KP Sentence and the simdoc's section hierarchy match_sent_head_nouns
Feature #2977: Support annotation amongst multiple books
Task #1560: Finding different POS tags that can be stripped from the beginning/ending of the keyphrase to result in better KPs
Task #1561: Modification in tagging_utils.py such that doc_ids with sim_score equal to the kp_doc_id are not removed
Task #1643: Modification in update_doc_id_score_list from tagging_utils.py such that for doc_ids with scores same as self links are reduced by 0.05
Task #1653: Adding all the header variants generated by variation_middle_parenthesis to processed_full_header with fullness_ratio 1.0
Task #1657: Modifications in remove_function_signature method to handle some new cases
Task #1663: Adding new header variants from pos_nouns function with fullness_ratio 1.0
Task #1711: Establish similarity between 'a b' and 'c-or-a b' (also 'c b' and 'c-or-a b')
Task #1726: Handling cases of bad header variants like "representation"
Task #1857: Extending the remove_header_by_adjective method to improve the quality of KPs
Task #1858: Extending the variation_slash function in BR3_IR3_tagger.py
Task #1862: Finding the reason why "initialization" header variant is giving high confidence KPs
Task #1863: Finalize scheme for removing "etc" header variant
Task #1864: Finalize scheme for removing "unaffected" header variant
Task #1895: Drop low confidence KPs if it is a substring/subKP of a nearby tagged KP with high confidence
Task #1897: Skip the repeating KPs within 50 words according to new scheme
Task #1898: Skip the repeating KPs within 50 words according to new scheme - Contd
Task #1899: Tag all the skipped KPs with special temporary tags
Task #1900: Un-tag the temporary tags added while skipping the KPs
Task #1907: Make changes in variations_in_common_section_words to make sure "Ipython interpreter" is linked to "interpreter" KP
Task #1921: Exclude both plural and singular forms of the words from Common Word list
Task #1927: Find the reason why "unsigned integers", "signed integers" & "floating point" are light red, i.e. low conf
Task #1961: Process the header variant such that discard the starting word if it is ADJ/ADV and within CW_50
Task #1975: Check why the sim_score of KP "stderr" is less as compared to "stdout"
Task #1976: Increase the sim_score of KP "dst" in Lib Ref
Task #2018: Multiple purple links should indeed be one purple link
Task #2023: 'dictionary' should be a dark purple link
Task #2046: Provide the top 10% of keyphrases by occurrence count for Lib ref, Tutorial, C-API
Task #2064: Add occurrence % to BR2 line
Task #2081: Don't show 'whitespace' as dark purple because the current document is a better result
Task #2094: Tweak 2604 formula for occurrence count
Task #2113: Enable creating annotated file with and without occurrence count
Task #2221: Treat singular and plural forms of ANPs as duplicates
Task #2223: Change proximity threshold from 50 words to 250 words
Task #2224: Maintain efficiency although we add brown links for duplicates
Task #2257: Copy definitions line to definitions-proc line
Task #2259: Use processed definition strings to find similar documents
Task #2371: Potential KPs ('immutable' within 'immutable sequence') that are part BRIR need to be given higher score
Task #2372: Reduce the impact of fullness ratio